We have characterized the sequences required for the
transcriptional regulation of the F0108 human H4 histone gene in vivo.
Recombinant cell lines that contained deletion constructs of the H4
promoter region were prepared in mouse C127 cells, and the level of
human H4 histone gene expression was measured by S1 nuclease analysis.
We found that the minimal sequences required for the initiation of
transcription from this gene were contained within the 73 nucleotides
5' to the initiation site of transcription. Within this region are
located an in vivo protein binding site (Site II), the GGTCC element,
and the TATA box. Deletion of the distal half of Site II abolished
site-specific initiation of transcription and demonstrated that the
TATA box and GGTCC element were not sufficient for initiation in vivo.
Extension of the H4 promoter to -100 base pairs resulted in a
significant increase in transcription, and this increase correlated
with the presence of an Sp1 site in the proximal half of the upstream
protein binding site, Site I. If the promoter region was lengthened to
-410 nucleotides, there was a two-fold increase in the level of
transcription. Deletion analysis suggested that the "distal-proximal"
positive element was located in the region from -210 to -330 base
pairs 5' to the cap site. We investigated the functionality of a
previously identified enhancer-like element located very far upstream
in the pF0116 fragment of λ HHG 41 and demonstrated that although it
functioned in HeLa cells it was not functional in mouse C127 cell
lines.

S1 analysis of distal deletion constructs supported the idea that a
negative  regulatory  element  of  H4  gene  transcription  was  located 
between  nucleotides  -730  and  -1010.  Analysis  of  the  region 
demonstrated  consensus  sequences  for  a  topoisomerase  II  site,  nuclear 
matrix  attachment  sites,  and  a  very  high  A/T  content  (70%)  suggestive 
of  bent  DNA.  Taken  together  this  set  of  results  implied  that  the  DNA 
topology  of  this  region  might  be  important  for  H4  gene  regulation. 

Additional  studies  demonstrated  that  Alu  repetitive  sequences  in 
the  histone  deletion  constructs  could  mediate  specific  integration  into 
the  mouse  chromosome  and  that  high  copy  number  was  possible. 
CHAPTER I
INTRODUCTION 


The  goal  of  this  study  has  been  to  assess  the  contribution  of 
promoter  sequences  in  the  F0108  human  H4  histone  gene  5'  flanking 
region  to  transcriptional  regulation  of  the  gene.  We  have  endeavored 
to  define  the  sequences  necessary  for  the  initiation  and  augmentation 
of  transcription.  The  TATA  box,  GGTCC  element,  "CAAT  box,"  "CCAAT  box," 
and Sp1 site have been implicated in transcriptional regulation and are
reviewed  below.  We  have  also  investigated  a  putative  enhancer-like 
element and negative regulatory sequence, and so these sequences are
also discussed below.

Historical  Background 
The  concepts  governing  gene  regulation,  as  we  know  them  today,  have 
their  foundations  in  the  work  of  many  biochemists  and  geneticists  who 
introduced  the  ideas  of  positive  and  negative  regulation  in  prokaryotic 
gene  expression.   The  observations  of  many,  that  the  total  genetic 
potential  of  a  cell  was  never  expressed  simultaneously,  referred  to  as 
"genetic  adaptation,"  led  Jacob  and  Monod  (1961)  to  address  the 
question  of  what  controls  this  phenomenon.   In  their  seminal  paper  the 
"operon  model"  was  proposed.   This  model  described  how  structural  genes 
expressed  themselves  and  how  that  expression  was  regulated.   It  had 
been  known  for  some  time  that  bacteria  could  respond  to  various 
nutrients by synthesizing new metabolic enzymes, so Jacob and Monod
investigated the lactose metabolic pathway of Escherichia coli (E.
coli). Their work was encouraged by many earlier investigators,
including Demerec (1956), who made the observation that genes coding
for similar enzymatic function were located in localized regions of the
Salmonella chromosome. Demerec was able to conclude that the genes he
had  investigated  were  in  a  nonrandom  distribution  and  that  perhaps  this 
conferred  an  evolutionary  advantage  to  the  organism. 

The  lac  operon  is  one  of  the  most  well  studied  genetic  systems  in 
all  of  prokaryotic  and  eukaryotic  molecular  biology.   The  many 
intuitive  observations  and  predictions  of  Jacob  and  Monod  and 
colleagues  led  to  the  identification  of  the  components  of  the  lac 
operon:   the  repressor,  produced  by  the  lac  I  gene;  the  lac  operator, 
promoter,  and  three  linked  structural  genes.   The  interplay  of  inducer 
and  repressor  was  demonstrated,  and  Jacob  and  Monod  proposed  that  the 
lac  operon  was  subject  to  negative  regulation.   An  initial  observation 
of  Jacob  and  Monod  (1961)  was  that  the  control  gene  would  make 
repressors  that  would  turn  off  the  structural  genes.   The  isolation  of 
nonsense mutations in the lac I gene (Bourgeois et al., 1965) provided
convincing  evidence  for  the  nature  of  repressors.   Suppression  of  the 
nonsense  mutation  restored  repressor  function  and  demonstrated  that 
repressor  genes  encoded  repressor  proteins.   The  final  proof  was  the 
isolation of the lac repressor by Gilbert and Muller-Hill (1966). In
addition  it  was  demonstrated  that  the  lac  operon  and  others  were  under 
more general control by catabolite activator protein and 3',5'-cyclic
AMP, as it was shown that both are required, in addition to the inducing
molecule, for the operon to be transcribed (Emmer et al., 1970).

The  ensuing  years  have  led  to  refinement  of  the  operon  model  as 
well  as  its  acceptance  as  one  of  the  general  organizational  patterns 
characteristic  of  prokaryotes.   In  particular,  the  concepts  of 
protein/DNA  interactions,  repression,  and  positive  and  negative 
regulation  have  carried  over  into  eukaryotic  molecular  biology  and 
have  served  as  a  basis  for  unraveling  the  complexity  of  the  eukaryotic 
cell.   The  extension  of  these  ideas  has  allowed  considerable  progress; 
however,  the  original  view  that  all  genes,  prokaryotic  and  eukaryotic, 
would  have  similar  regulatory  and  organizational  patterns  has  not  been 
borne  out.  In  fact  there  is  a  great  diversity  in  the  regulatory 
mechanisms  that  govern  both  prokaryotic  and  eukaryotic  gene 
expression. 

The  control  of  eukaryotic  gene  regulation  has  been  of  obvious 
interest,  but  research  has  been  slower  than  in  prokaryotes  because  of 
the  complexity  and  technical  difficulties  encountered  when  working  with 
the  eukaryotic  cell.   Two  avenues  of  study  have  predominated  in 
eukaryotic  molecular  biology:   the  investigation  of  viral  models  such 
as  adenovirus  and  SV40  (as  was  done  with  the  prokaryotic  phages  lambda 
and  T7)  and  the  characterization  of  cellular  genes  and  the  proteins 
that  regulate  their  expression. 

Eukaryotic  molecular  biologists  have  had  to  develop  the  appropriate 
technology  because  many  of  the  advantageous  prokaryotic  techniques  are 
not  directly  applicable  to  eukaryotic  systems.   Two  of  the  most 
important discoveries that have revolutionized molecular biology are
restriction enzymes (reviewed by Nathans and Smith, 1975) and DNA
ligase (Modrich et al., 1973; Weiss and Richardson, 1967). With these
new enzymatic tools the ability to manipulate DNA fragments developed
quickly  and  was  responsible  for  the  present  state  of  advancement. 

Viral  Model  Systems 

The  utilization  of  viral  model  systems  for  the  characterization  of 
eukaryotic  regulatory  mechanisms  was  a  logical  extension  of  the  work 
done  in  prokaryotes.   In  particular,  adenovirus  and  SV40  have  provided 
considerable  insights  into  eukaryotic  gene  regulation.   Without  an 
understanding  of  the  exact  mechanisms  involved  in  the  various  processes 
of  RNA  transcription  and  DNA  replication,  it  was  obvious  to  early 
investigators  that  viruses,  such  as  SV40,  could  invade  and  eventually 
kill  the  host  cell  and  yet  were  extremely  dependent  on  the  cell's 
enzymatic  machinery  to  accomplish  their  replicative  cycle. 

Adenoviruses were first isolated by Rowe et al. (1953) as the
agent responsible for the degeneration of human adenoid tissue in
culture. The adenovirus life cycle in human cells has been examined
with respect to the virus-specific proteins produced, replication of
viral DNA, transcription of viral genes, and effect on the host cell
(reviewed in Tooze, 1980). Initial studies demonstrated that there
were two phases, early and late, in the expression of adenovirus genes
(Lindberg et al., 1972). As a measure of the impact of infection on the
cell, adenovirus mRNA comprises almost all the mRNA bound to
polyribosomes by the end of the replicative cycle (Thomas and Green,
1966). The early viral mRNA was detected and mapped to precise
locations on the adenovirus genome by R-loop mapping (Thomas et al.,
1976) and hybridization to restriction endonuclease fragments of
adenovirus DNA (Sharp et al., 1975). Restriction enzymes permitted the
mapping  and  orientation  of  DNA  fragments  and  transcription  units  on  the 
SV40  genome  as  well  (Khoury  et  al.,  1973;  Sambrook  et  al.,  1973). 

Several  laboratories  utilized  adenovirus/SV40  recombinant  hybrids 
to  define  essential  genomic  regions  of  each.   In  particular,  the  hybrid 
viruses  were  useful  in  the   determination  of  the  functional  "helper" 
domain  of  the  SV40  T  antigen,  as  adenovirus  requires  "help"  to  grow  in 
nonpermissive cells (Fey et al., 1979). With the mRNA coding regions
mapped  on  the  adenovirus  and  SV40  genomes,  a  more  informative  analysis 
and  interpretation  were  initiated  which  have  begun  to  elucidate  the 
complex  nature  of  transcriptional  regulation  in  these  viruses.   The 
promoter  structure  and  presence  of  enhancing/silencing  elements  in 
these  viruses  have  served  as  continuing  models  for  studies  of  cellular 
promoters  and  regulatory  sequences.   Additionally,  although  not 
discussed  here,  both  adenovirus  and  SV40  were  utilized  in  the  discovery 
of  mRNA  splicing  (Berk  and  Sharp,  1977,  1978),  which  has  revolutionized 
our  concepts  of  gene  regulation  and  expression. 

Chromatin  Studies 

At  the  same  time  that  the  viral  model  systems  were  beginning  to  be 
reasonably  well  understood,  there  were  a  number  of  investigators 
pursuing  the  characterization  of  cellular  genes  and  transcriptional 
mechanisms.   Although  restriction  enzymes  had  been  discovered  (Smith 
and  Wilcox,  1970)  and  their  applicability  realized,  it  was  several 
years  before  their  purification  and  recombinant  DNA  technology  were 
worked  out  to  make  them  sufficiently  useful.   This  lag  did  not  deter  a 
number of investigators from direct examination of the transcriptional
process  in  eukaryotic  cells.   As  early  as  1962  isolated  pea  embryo 
chromatin  had  been  utilized  as  a  template  for  transcription  (Huang  and 
Bonner,  1962).   Isolated  chromatin  was  incubated  with  bacterial  RNA 
polymerase  (the  purification  of  eukaryotic  RNA  polymerases  had  not  been 
achieved  at  this  time)  and  the  four  ribonucleoside  triphosphates.   A 
comparative  analysis  of  transcription  from  chromatin  and  deproteinized 
DNA  of  the  same  source  indicated  that  the  chromatin  was  less  able  to 
support  transcription  (Huang  and  Bonner,  1962).   It  was  postulated  that 
part  of  the  chromatin  was  repressed,  perhaps  due  to  the  presence  of 
histone  proteins  bound  to  the  DNA.   The  amount  of  transcription 
possible  from  a  known  quantity  of  chromatin  was  referred  to  as  its 
template  capacity.   The  determination  of  template  capacity  in  chick 
oviduct,  a  steroid  responsive  tissue,  led  to  the  observation  that  the 
level  of  transcription  was  modulated  with  the  addition  of  hormone 
(Dahmus  and  Bonner,  1965).   The  amount  of  template  capacity  also 
correlated  with  the  various  developmental  stages  of  sea  urchin  growth 
(Johnson  and  Hnilica,  1970).   Another  more  accurate  measure  of  the 
"transcriptional  capacity"  of  a  sample  of  chromatin  was  the  number  of 
RNA  polymerase  initiation  sites.   Cedar  and  Felsenfeld  (1973)  first 
measured  the  number  of  E.  coli  RNA  polymerase  initiation  sites  on 
chromatin  by  incubating  chromatin  and  RNA  polymerase  together  in  the 
absence  of  ribonucleoside  triphosphates.   Next,  the  addition  of  the 
ribonucleoside  triphosphates  with  high  levels  of  ammonium  sulfate 
permitted  elongation  but  not  reinitiation.   One  of  the  major  criticisms 
of this early work was that the use of bacterial RNA polymerase made
accurate interpretation doubtful. Comparative studies were performed
by Mandel and Chambon (1970) and Tsai et al. (1976). These
investigators  demonstrated  that  there  was  no  competition  for  either 
SV40  DNA  or  calf  thymus  DNA  by  the  bacterial  or  eukaryotic  RNA 
polymerase.   However,  when  Tsai  et  al.  (1976)  compared  hen  oviduct  and 
E.  coli  RNA  polymerase  initiation  sites  on  chick  DNA  or  chick  oviduct 
chromatin,  they  found  no  competition  on  the  DNA,  but  direct  competition 
in  the  chromatin  sample.   Thus  it  appeared  that  chromosomal  proteins 
could  modify  the  initiation  specificity  such  that  both  enzymes  were 
competing  for  similar  sites.  To  establish  this  point  conclusively,  the 
product  mRNAs  had  to  be  examined.  Filter  hybridization  techniques  were 
developed  that  permitted  the  detection  of  reiterated  gene  transcripts 
and  particularly  abundant  mRNAs.  At  the  level  of  sensitivity  possible 
with  this  methodology,  in  vitro  chromatin  transcription  appeared  to 
reflect  an  accurate  view  of  the  transcriptional  status  in  vivo 
(Bacheler  and  Smith,  1976). 

The  next  major  advance  was  the  fractionation  of  chromosomal 
proteins  in  an  effort  to  reconstitute  transcriptionally  competent  DNA 
into chromatin in vitro. The first attempts to reconstitute chromatin
were studies by Paul and Gilmour (1966, 1968) and Bekhor et al. (1969)
in which they fractionated chromatin proteins in an attempt to
discover what group of proteins controlled transcription. Their
results indicated that the non-histone chromosomal protein (NHCP)
fraction was probably responsible. The role of NHCP in the expression
of several genes has been reviewed (Stein et al., 1974; Simpson,
1973).

Experiments  became  more  refined  as  exemplified  by  the  studies  of 
Tsai  et  al.  (1976)  who  examined  the  inducible  ovalbumin  gene  in  the 
chick  oviduct  system.   The  role  of  NHCP  was  established,  and  through  a 
series  of  competition  assays  with  induced  and  uninduced  NHCPs  it  was 
demonstrated  that  in  vitro  expression  of  the  ovalbumin  gene  was 
stimulated  by  the  appearance,  upon  steroid  induction,  of  a  positive 
regulatory  factor.  Histones,  a  moderately  reiterated  family  of  genes 
(Stein et al., 1984), were also studied in a similar manner to examine
the role of NHCPs. Several studies indicated that NHCPs were involved
in the increased expression of the histone genes during S phase of the
cell cycle (Park et al., 1976; Stein et al., 1975). Kleinsmith et al.
(1976)  extended  the  characterization  and  demonstrated  that 
phosphorylation  of  the  NHCP  was  necessary  for  optimal  in  vitro 
expression  of  the  histone  genes.  When  the  NHCPs  were  treated  with 
phosphatase  before  addition  to  the  reaction,  there  was  a  decrease  in 
the  number  of  transcription  initiation  sites. 

The  role  of  the  histone  proteins  in  transcription  has  been  of  great 
interest  because  they  form  such  a  close  association  with  the  DNA. 
Studies  with  either  electron  microscopy  or  nuclease  digestion  have 
demonstrated  that  there  is  either  a  change  in  the  histone/DNA  ratio  or 
a  conformational  change  in  the  nucleosomes  associated  with  genes 
undergoing active transcription (Weintraub and Groudine, 1976). The
chromatin structure of specific genes has also been shown to be
conformationally altered only in tissues where they are
expressed. Examples include the β-globin gene in chick embryo red
blood  cell  nuclei  and  the  ovalbumin  gene  in  chick  oviduct  nuclei 
(Garel  and  Axel,  1976).  Also,  several  investigators  have  proposed  that 
nucleosomes  might  be  "phased"  on  the  chromosome  so  as  to  render 
particular  areas  of  the  DNA  accessible,  or  inaccessible,  to 
transcription  factors  (Gottschling  and  Cech,  1984;  Linxweiler  and  Horz, 
1985).  Thus,  at  this  juncture,  it  became  more  realistic  to  assume  that 
the  chromatin  structure  of  active  genes  in  comparison  to  silent  loci 
was  a  more  open  and  dynamic  conformation,  yet  not  necessarily  devoid  of 
histones  as  had  been  postulated. 

In  Vitro  Transcription 
During  the  early  1970s,  several  investigators  actively  pursued  the 
activity  (or  activities)  responsible  for  the  synthesis  of  the  various 
eukaryotic mRNAs. Almost simultaneously several laboratories were able
to  isolate  multiple  RNA  polymerase  activities  on  DEAE-Sephadex  columns 
(Chambon,  1975;  Roeder,  1976).   Each  peak  of  activity  exhibited  a 
different  susceptibility  to  the  inhibitor  amanitin  (Kedinger,  1970). 
There  were  differences  in  the  results  they  obtained  as  evidenced  by  the 
diverse  number  of  variant  RNA  polymerase  activities  that  were 
originally  identified  (Roeder,  1976).  As  the  purity  of  the  RNA 
polymerase  activity  increased  it  became  more  obvious  that  there  were 
three  distinct  RNA  polymerase  activities  present  in  eukaryotic  cells 
(Roeder,  1976).  It  was  very  difficult  for  early  investigators  to  make 
progress  toward  understanding  the  relationship  between  the  various 
eukaryotic  RNA  polymerases  and  their  respective  function  in  the 
expression of genes, because adequate templates for transcription in
vitro were not available. The predominant templates used were either
homopolymers, bacteriophage DNA, or fractions of genomic DNA enriched
in either ribosomal or satellite DNA (Chambon, 1975). These proved
unsatisfactory, and the results were often confusing. Several lines of
evidence suggested that ancillary factors were necessary in order for
RNA polymerase, in particular RNA polymerase II, to exhibit template
specific transcription (Chambon, 1975). The application of restriction
enzymes  to  the  manipulation  of  DNA  led  to  the  cloning  of  specific  genes 
that  were  then  suitable  as  templates  for  in  vitro  transcription 
systems  (Nathans  and  Smith,  1975). 

The  biological  implications  of  the  viral  model  systems  that  had 
been studied in vivo, and the new DNA cloning technology, prompted
several  investigators  to  develop  cell  free  transcription  systems.  It 
was  obvious  that  it  would  be  advantageous  to  work  with  an  in  vitro 
system  to  dissect  the  various  components  of  the  eukaryotic 
transcriptional  apparatus.  The  first  in  vitro  transcription  systems 
were  developed  for  RNA  polymerase  III,  and  shortly  thereafter,  RNA 
polymerase  II.  RNA  polymerase  III  is  responsible  for  the  synthesis  of 
5S ribosomal RNA (Ng et al., 1979), tRNAs, and a few viral RNAs
including the adenovirus VAI and VAII RNAs (Fowlkes and Shenk, 1980).
Cell free transcription of the Xenopus 5S rRNA gene by RNA polymerase
III was first demonstrated by Birkenmeier et al. (1978) in nuclear
extracts of Xenopus oocytes. At the same time it was shown that
cytoplasmic extracts of human KB cells (Wu, 1978; Weil et al., 1979)
were able to transcribe selectively cloned 5S rRNA, tRNA, and
adenovirus VA RNA genes. The cytoplasmic extracts were shown to
contain a majority of the RNA polymerase III activity (Weil et al.,
1979) that had apparently leaked from the nucleus during preparation of
the extract. With respect to RNA polymerase II, Manley et al. (1980)
prepared  a  concentrated  HeLa  cell  extract  that  was  able  to  initiate 
transcription  accurately  in  vitro  at  a  variety  of  adenovirus  RNA 
polymerase  II  transcriptional  control  regions. 

In  vitro  transcription  was  and  is  a  powerful  technique  for  the 
investigation  of  eukaryotic  promoter  function.  The  concomitant 
development  of  various  molecular  techniques  for  the  mutation  and 
reassortment  of  DNA  sequences  was  fortuitous,  and  in  a  relatively  short 
period  of  time  the  basic  sequence  requirements  of  the  RNA  polymerase  II 
promoter were delineated (Efstratiadis et al., 1980). Although
considerable  refinement  has  occurred  in  our  knowledge  of  these 
sequences,  the  basic  elements  have  not  changed.  One  of  the  first 
sequences  to  be  implicated  because  of  similarity  to  prokaryotic 
promoter sequences was the "TATAA" box (Goldberg-Hogness). This A-T
rich  stretch  is  located  -25  to  -35  bp  upstream  of  the  mRNA  start  site 
in  RNA  polymerase  II  promoters  and  is  remarkably  similar  to  the  Pribnow 
box  (TATAAT)  described  for  the  promoters  of  prokaryotic  genes  (Pribnow, 
1975).  The  only  real  difference  is  the  location  of  the  Pribnow  box, 
which  is  at  -10  bp  from  the  start  of  transcription  (Rosenberg  and 
Court,  1979).  It  should  be  noted  that  the  comparison  of  the  Pribnow  box 
with  the  Hogness  box  has  revealed  variations  in  sequence  and  some 
difference  in  function.  Principally,  the  Pribnow  box  is  absolutely 
required  for  transcription  to  occur  in  prokaryotes;  however,  as 
discussed below, the Hogness box is not as stringently required. The
second  sequence  that  has  been  retained  with  equally  remarkable 
similarity  is  the  "CAAT"  box.  The  consensus  sequence  for  this  element 
is 5'-GG(C/T)CAATCT-3' (Efstratiadis et al., 1980; Dynan and Tjian, 1985)
and  is  usually  located  -70  to  -80  bp  from  the  mRNA  start  site. 
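
The positional conventions described above (a TATAA box roughly 25 to 35 bp and a CAAT box roughly 70 to 80 bp upstream of the mRNA start site) can be illustrated with a minimal Python sketch. It assumes the 5' flanking sequence is written 5' to 3' with its last base at position -1. The function name `find_motif` and the toy sequence are hypothetical and do not come from any real gene or from the work reviewed here.

```python
import re

def find_motif(upstream, motif):
    """Return the position (relative to +1) of the first base of each match.

    `upstream` is the 5' flanking sequence written 5'->3'; its last base is
    taken to be position -1 (an assumption made for this illustration).
    """
    n = len(upstream)
    positions = []
    # A lookahead pattern reports overlapping occurrences as well.
    for m in re.finditer(f"(?={re.escape(motif)})", upstream.upper()):
        positions.append(m.start() - n)
    return positions

# Toy 95-nt flanking sequence (hypothetical, G filler only for clarity).
upstream = "G" * 20 + "CCAAT" + "G" * 40 + "TATAA" + "G" * 25

tata_positions = find_motif(upstream, "TATAA")
caat_positions = find_motif(upstream, "CCAAT")
print(tata_positions, caat_positions)
```

In this toy sequence the TATAA motif begins at -30 and the CCAAT motif at -75, inside the windows quoted in the text. Exact string matching is a simplification; real promoters deviate from the consensus.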

Although  the  TATA  box  and  CAAT  box  have  been  found  in  a  majority  of 
RNA  polymerase  II  promoters  and  appear  to  be  the  framework  around  which 
gene  specific  variations  in  regulatory  sequences  occur,  there  have  been 
some genes described that have no TATA box (Contreras and Fiers, 1981;
Melton et al., 1986; Reynolds et al., 1984). A subset of these genes
instead have a highly G-C rich promoter and in general lack the
strict structure created by consensus RNA polymerase II sequences.
Examples  include  enzymes  such  as  mouse  dihydrofolate  reductase  (Farnham 
and  Schimke,  1985),  hamster  3-hydroxy  3-methylglutaryl  coenzyme  A 
reductase  (Reynolds  et  al.,  1984),  and  human  phosphoglycerate  kinase 
(Singer-Sam  et  al.,  1984).  These  genes  are  often  constitutive  and  hence 
have  been  described  as  "housekeeping  genes."   Because  the  TATAA  and 
CAAT  homologies  were  found  in  many  genes,  it  was  thought  that  they 
might  function  in  the  regulation  of  transcription.  Early  in  vitro 
transcription  experiments  done  by  Wasylyk  et  al.  (1980)  indicated  that 
the  promoter  of  the  conalbumin  gene  could  be  deleted  to  -44  bp  from  the 
mRNA  start  site  without  any  effect  on  the  transcription  of  the  gene. 
However,  when  these  same  investigators  introduced  even  a  single  base 
change  into  the  TATAA  box,  there  was  a  10  fold  decrease  in  the  amount 
of transcription. Similar results were obtained with the adenovirus 2
major late control region (Corden et al., 1980; Hu and Manley, 1981;
Concino  et  al.,  1984). 

In  contrast  to  the  in  vitro  results,  it  was  noticed  that  the  TATAA 
box,  in  general,  was  not  essential  for  transcription  in  vivo.  Benoist 
and  Chambon  (1980)  made  an  SV40  deletion  mutant  that  lacked  the  TATAA 
box  preceding  the  early  transcription  unit.  This  mutant  was  capable  of 
synthesizing  T  antigen  and  transforming  rat  cells.  Similar  results  were 
obtained  with  the  polyoma  virus  early  transcription  unit  (Bendig  et 
al.,  1980).  It  was  also  established  that  the  TATAA  box  preceding  the 
sea  urchin  H2A  transcription  unit  was  not  necessary  for  function  in 
vivo  (Grosschedl  and  Birnstiel,  1980a).  The  deletion  mutants  that 
Grosschedl made were assayed by injection into Xenopus
oocytes. A 54 bp deletion that included the TATAA box lowered the level
of  transcription  5  fold  but  did  not  abolish  activity. 

If  the  TATAA  box  is  not  absolutely  essential  in  vivo  for 
transcription,  then  what  is  the  function  of  this  highly  conserved 
sequence?  The  answer  came  from  a  series  of  SV40  early  promoter  mutants 
in  which  the  TATAA  box  was  deleted  (Gluzman  et  al.,  1980).  From  this 
set  of  mutants  it  was  demonstrated  that  in  vivo  the  initiation  of  SV40 
early  transcription  occurred  downstream  of  the  normal  site.  Also  it  was 
established  by  Gluzman  et  al.  (1980)  that  when  there  were  deletions 
between  the  start  of  transcription  and  the  TATAA  box  the  site  of 
initiation remained a constant 25 bp ± 2 bp downstream. This
demonstrated that regardless of the deletion, the mRNA cap site was
determined by the position of the TATAA box. Grosschedl and Birnstiel
(1980b) found that multiple initiation sites were utilized in vivo
when  the  TATAA  box  was  deleted  from  the  sea  urchin  H2A  gene.  Since  the 
lack  of  a  TATAA  box  caused  heterogeneity  in  the  start  site  of 
transcription  for  several  genes,  it  is  now  considered  that  the  TATAA 
box  functions  in  vivo  to  specify  the  correct  mRNA  initiation  site. 
Early  in  vitro  transcription  studies  did  not  directly  discern 
whether  the  CAAT  box  was  necessary  for  transcription  (reviewed  in 
Shenk,  1981).  However,  more  recent  and  detailed  studies  have  determined 
that  the  CAAT  box  does  play  a  role  in  transcriptional  regulation. 
Detailed mutagenesis studies by McKnight and Kingsbury (1982), McKnight
et al. (1984), and Myers et al. (1986) elegantly demonstrated the need
for the CAAT box. Initially the studies of McKnight and Kingsbury
(1982) dissected the Herpes Simplex thymidine kinase gene (HSV tk) into
discrete areas required for expression: these included the TATAA box
and two upstream regions referred to as distal signal I (dsI) and
distal signal II (dsII). To pinpoint these small regions accurately
they  developed  a  technique  called  "linker-scanning"  mutagenesis  which 
introduces  clustered  sets  of  point  mutations  in  a  short  sequence  of 
DNA.  Specifically,  these  mutations  were  constructed  by  ligation  of  a 
series  of  complementary  3'  and  5'  deletions  joined  via  a  synthetic 
linker  (BamHI).  The  mutants  that  McKnight  and  Kingsbury  created  spanned 
the  proximal  120  bp  5'  to  the  mRNA  start  site  and  thus  they  were  able 
to  assign  a  boundary  to  all  the  sequences  required  for  HSV  tk  gene 
expression  after  microinjection  into  Xenopus  oocytes.  In  subsequent 
studies dsI and dsII of the HSV tk gene have been shown to interact
specifically with a cellular protein (Jones et al., 1985). This
protein, Sp1, was initially purified by Dynan and Tjian (1983a) from
HeLa cells because of its affinity for the SV40 early promoter, later
identified as the G-C rich sequences of the 21 bp repeats. Once the
sequence of the binding site (GGGCGG) on SV40 was confirmed by various
in vitro methods (e.g., DNase I footprinting), the purified protein was
tested for binding on a variety of other genes that contain a G-C rich
sequence(s), including the mouse dihydrofolate reductase gene (Dynan et
al., 1986) and more recently the rat insulin-like growth factor gene by
Evans et al. (1988). Both of these genes contain several Sp1 binding
sites, identified in vitro by DNase I footprinting, and the sites in
the rat insulin-like growth factor gene are of varying affinity,
depending on the sequence.
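
The logic of linker-scanning mutagenesis described in this paragraph, in which a short window of wild-type sequence is replaced by a synthetic linker while the spacing of all flanking sequences is preserved, can be sketched in Python. This is only an in silico illustration under stated assumptions, not the cloning procedure itself. The function name `linker_scan`, the 10-nt linker `CCGGATCCGG` (chosen because it contains the BamHI site GGATCC), and the toy promoter are hypothetical.

```python
def linker_scan(seq, linker="CCGGATCCGG", step=None):
    """Build a set of linker-scanning mutants of `seq`.

    Each mutant replaces one window of len(linker) bases with the linker,
    so total length and the spacing of flanking elements are unchanged.
    `seq` is 5' flanking DNA written 5'->3' with its last base at -1
    (an assumption made for this illustration).
    Returns {(left, right): mutant_sequence}, coordinates relative to +1.
    """
    width = len(linker)
    step = step or width  # non-overlapping windows by default
    mutants = {}
    for start in range(0, len(seq) - width + 1, step):
        mutant = seq[:start] + linker + seq[start + width:]
        left = start - len(seq)    # 5' boundary of the substituted window
        right = left + width - 1   # 3' boundary of the substituted window
        mutants[(left, right)] = mutant
    return mutants

# Toy 30-nt promoter (hypothetical); three non-overlapping 10-nt windows.
promoter = "A" * 30
mutants = linker_scan(promoter)
for window, sequence in sorted(mutants.items()):
    print(window, sequence)
```

Because every mutant has the same length as the wild-type sequence, a loss of expression can be attributed to the substituted window rather than to a change in spacing between elements.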

Subsequent to the purification of Sp1 several groups reported the
identification of a cellular protein that interacts with the CAAT box
sequence and has been referred to as either CAAT box transcription
factor (CTF) by Jones et al. (1985) or CAAT box binding protein (CBP)
by Graves et al. (1986). Jones et al. (1985) demonstrated an
interaction in dsII of the HSV tk promoter between Sp1 and CTF, thus
indicating  that  distinct  transcription  factors  may  interact  to  regulate 
expression.  The  identification  of  CTF  prompted  the  search  for  other 
putative  transcription  factors,  and  although  the  evidence  is  somewhat 
preliminary,  there  appear  to  be  at  least  3-4  different  CAAT  box 
binding  activities  depending  on  the  source  of  the  material  used  to 
purify  the  activity  and  the  criteria  used  for  analysis  (Dorn  et  al., 
1987)  .  CBP  and  CTF  differ  from  each  other  in  their  heat  stability 
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(McKnight  and  Tjian,  1986).  A  CAAT  box  binding  factor  isolated  from 
HeLa  cells  in  our  laboratory  (van  Wijnen  et  al..  1988),  termed  HiNF-B,   . 
is  yet  another  addition  to  this  growing  family  of  proteins,  with 
properties  that  distinguish  it  from  previous  isolated  CAAT  box-binding 

factors . 

The  most  sophisticated  study  to  date  on  the  subject  of 
transcriptional  regulatory  sequences  was  done  recently  by  Myers  et  al. 
(1986).  These  investigators  developed  a  quick  method  for  the 
introduction  of  single  point  mutations  in  a  small  region  of  DNA.  They 
mutated nearly every base from -1 to -101 bp of the mouse β-globin
promoter. With a battery of over 100 clones, each with a single base
change in the promoter, they were able to assay the expression of the
mutant constructs in vivo in a short term transient assay. They could
therefore assign functional limits to consensus regulatory sequences
and discover any minor, or as yet unnoticed, contributing nucleotides.
In addition, both transversions and transitions were tested to assess
their effects on expression. They demonstrated a requirement for the
TATAA box (-25 bp) and the CAAT box (-75 bp), as well as an upstream
sequence characteristic of the β-globin genes, CACCC (-96 bp), in
β-globin transcription. Significantly, an "up" promoter mutation was
discovered when the two bases, GG, immediately 5' to the CCAAT box were
changed to AA. The result of this mutation was a 3-4 fold increase in
the level of message. One interpretation of this result is that a CAAT
box transcription factor is able to bind more tightly or more
specifically and therefore perform its function more efficiently. With
the number of CAAT box binding factors that are being found in various
systems, it is also possible that the "up" mutation results in the
binding of an alternative, as yet unidentified, protein that carries
out the same function more efficiently.
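
The logic of a saturation point mutagenesis screen of this kind can be
sketched in a few lines of Python. The sketch below enumerates every
single-base substitution of a short sequence and labels each one as a
transition (purine to purine, or pyrimidine to pyrimidine) or a
transversion (purine to pyrimidine, or the reverse). The function names
and the toy sequence are hypothetical. The sketch illustrates only the
bookkeeping and is not the mutagenesis method used by Myers et al.
(1986).

```python
PURINES = {"A", "G"}


def classify_substitution(ref, alt):
    """Label a base change as a 'transition' or a 'transversion'."""
    if ref == alt:
        raise ValueError("reference and mutant base are identical")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def single_point_mutants(seq):
    """Yield (index, ref, alt, kind, mutant) for every single-base change."""
    seq = seq.upper()
    for i, ref in enumerate(seq):
        for alt in "ACGT":
            if alt != ref:
                mutant = seq[:i] + alt + seq[i + 1:]
                yield i, ref, alt, classify_substitution(ref, alt), mutant


# Toy sequence: a CCAAT box preceded by GG (illustrative only).
mutants = list(single_point_mutants("GGCCAAT"))
print(len(mutants))  # three substitutions per position
for index, ref, alt, kind, mutant in mutants[:3]:
    print(index, ref, alt, kind, mutant)
```

Each position has exactly one possible transition and two possible
transversions, so a complete screen of an n-base region contains 3n
single mutants.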

In addition, there are temporal and tissue-specific sequences that
are found in the promoters of some genes and regulate expression at the
transcriptional level. Many of these elements fall into categories of
modulatory sequences referred to as enhancers, negative elements, and
silencers.

Enhancers  and  Silencers 
The  promoter  of  a  gene  has  generally  been  defined  as  the  minimal 
sequences  necessary  for  the  initiation  and  maintenance  of  a  basal  level 
of  specific  transcription.  Additional  elements  that  modify  the 
expression  of  a  gene  either  during  development,  temporally,  in  a  tissue 
specific  manner,  or  as  a  result  of  an  inducer,  would  seem  a  necessity 
if  adequate  regulation  in  the  eukaryotic  cell  is  to  be  achieved.  In  the 
preceding  5-10  years  a  number  of  investigators  have  provided 
considerable  evidence  for  the  existence  of  positive  regulatory 
sequences referred to as enhancers (reviewed in Serfling et al., 1985;
Maniatis et al., 1987). The properties of an enhancer are that 1)
there is strong activation of the linked gene from the correct
initiation site, 2) it exhibits independence of orientation, 3) it is
operative at long distances whether 3' or 5', and 4) it preferentially
stimulates transcription from the closest promoter when promoters are
tandemly arranged (Serfling et al., 1985). The prototype enhancer
elements are the 72 bp repeats of SV40, which have been extensively
characterized (Benoist and Chambon, 1980; Fromm and Berg, 1982;
Treisman and Maniatis, 1985). Several experiments in which the SV40
enhancer has been fused to the mouse β-globin promoter have
demonstrated the relationships that exist between an enhancer and
promoter. Banerji et al. (1981) demonstrated that the SV40 enhancer
could promote hundred-fold higher levels of rabbit β-globin
transcription whether located 1400 or 3300 base pairs away. Treisman
and Maniatis (1985) demonstrated that SV40-enhanced transcription of
the mouse β-globin gene depended on the presence of a functional
promoter. Point mutations in the upstream promoter elements (UPE) of
the β-globin promoter abolished transcription almost totally. In
conjunction with these results, Treisman et al. (1985) demonstrated
that when the β-globin promoter was deleted, and the SV40 enhancer was
moved to a proximal position, transcription returned to a high level.
It  would  then  appear  that  enhancers  are  like  promoters  but  not  vice 
versa.   Bienz  and  Pelham  (1986)  demonstrated  that  the  tandem 
duplication  of  transcriptional  control  sequences  could  result  in 
enhancing  ability.  They  found  that  the  duplication  of  a  heat  shock 
regulatory  element  (HSE)  could  function  as  an  enhancer  (distance 
activation)  whereas  a  single  HSE  was  inactive  at  a  distance.   So  one  of 
the  major  differences  between  enhancers  and  promoters  (action  at  a 
distance)  may  be  due  to  the  number  of  "promoter"  elements  present  with 
some accompanying specific sequences (Maniatis et al., 1987). The
importance of the specific sequences should not be down-played, as a
consensus core sequence, 5'-GTGGAAAG-3', has been identified in viral
and cellular enhancers (Khoury and Gruss, 1983).

Differences may also be the result of the arrangement of
transcriptional regulatory sequences. Why does an increased number of
regulatory sequences in many cases stimulate transcription so
dramatically? It has been proposed that the protein-protein
complexes that arise from the juxtaposition of regulatory sequences
result in increased transcription. Therefore, since most enhancers
contain repeated elements, it is possible that they function in the
organization of the transcriptional apparatus. Exceptions to this
exist, of course; tandem duplication of the CCAAT box does not lead to
a DNA fragment with enhancer qualities, i.e., there is no enhancement
at a distance (Bienz and Pelham, 1986). Perhaps this result is also a
reflection of the idea that some "transcription" factors bind to the
DNA but do not act directly. Instead they function through their
association with adjacent proteins (Maniatis et al., 1987). An example
is that CTF has been shown to associate closely with Sp1 protein in
the herpes virus tk gene (Jones et al., 1985). Significantly, it has
recently become
apparent  that  the  mechanism  of  transcriptional  activation  by  upstream 
activation  sites  (UASs)  in  yeast  is  conserved  in  mammals.  Several 
studies  over  the  last  year  have  demonstrated  1)  that  activator 
proteins  in  yeast  are  composed  of  a  DNA  binding  domain  in  the  amino 
terminus  of  the  protein  and  a  transcriptional  activator  in  the  carboxy 
terminus,  and  2)  that  when  the  yeast  proteins  are  expressed  in 
mammalian  cells  (with  the  appropriate  binding  site  present  in  the 
promoter  of  the  target  gene)  they  can  activate  transcription  (Kakidani 
and Ptashne, 1988; Webster et al., 1988; Hope and Struhl, 1986). Taken
together with what is known about transcriptional regulation in higher
eukaryotes, it appears that the separation of the DNA binding domain
and the transcriptional activation domain of regulatory proteins may be
conserved from yeast to mammals. In addition, the mechanism is probably
conserved as well.

Several of the better characterized enhancer sequences are part
of a group related by tissue specificity of expression. The
immunoglobulin (Ig) enhancer of the heavy chain locus is located
several thousand bp 3' to the variable region promoter. This enhancer
sequence, in its entirety, is only active in cells of the lymphoid
lineage (Gillies et al., 1983; Banerji et al., 1983). As has been
found for the SV40 enhancer, the Ig enhancer is composed of several
distinct elements that interact with specific proteins in vivo (Church
et al., 1985). One of the core elements of the Ig enhancer is the
"octamer" sequence, 5'-ATGCAAAT-3'. It is of special interest as it
also appears in the promoters of a few cellular genes, including
histone H2B (Harvey et al., 1982) and (2'-5') oligo-A synthetase
(Benech et al., 1985). How this element contributes to tissue
specificity in one context (Ig enhancer) and not in another (histone
H2B) remains to be determined. Recent in vitro binding studies of
proteins that interact with the SV40 "octamer" sequence have
demonstrated that both general and tissue specific factors bind this
sequence, and this may relate to its role in tissue specific
regulation (Rosales et al., 1987). Also, careful mapping of the binding
of HeLa and B cell nuclear proteins to the SV40 enhancer has revealed
subtle differences in the extent to which various motifs are
protected, which is indicative of differential protein/DNA
interactions (Davidson et al., 1986).
Enhancers should not be mistaken for promoters with additional
sequence attached or interspersed. In many cases they exhibit
exceptional cell-type and temporal specificity with respect to
transcriptional activation. Deletion analysis has indicated that
certain core sequences of the IgH enhancer may function in nonlymphoid
cells to shut off the enhancer action (Wasylyk and Wasylyk, 1986;
Kadesh et al., 1986).

The  implication  of  a  negative  regulatory  mechanism  for  the  control 
of  IgH  enhancer  action  presents  a  confusing  picture  of  tissue  specific 
and  temporal  gene  regulation.  At  first  it  was  thought  that  the  absence 
of  necessary  factors  for  enhancer  action  was  the  reason  for 
differential  activity  in  various  tissues  (Maniatis  et  al.,  1987). 
However, this has been shown to be not entirely correct, as many of the
factors found in B cell extracts are also present in other types of
cells. So, either the DNA binding sites are inaccessible in
nonlymphoid cells, or there must be an interaction with a B cell
specific protein (Maniatis et al., 1987). Recently, Sen and Baltimore
(1986) discovered a factor, NF-κB, that is present in many cell types
and interacts with the kappa-chain gene enhancer, but only after
modification to an active form in B-cells.

Negative  regulation  of  gene  expression  is  an  old  subject  for 
prokaryotic  molecular  biologists,  but  is  relatively  new  to  eukaryotic 
gene  regulation.  The  first  description  of  the  SV40  enhancer  element 
caused  everyone  to  search  for  similar  elements  in  other  genes,  and  the 
identification  of  negative  regulatory  sequences,  especially  in  viral 
enhancers, has had a similar effect. It is important to understand that
negative regulatory sequences can be divided into two groups: 1) those
sequences that shut off the activity of another regulatory element
(such as an enhancer) and have been found to exist within the confines
of the enhancer element, and 2) sequences that act independently of
other regulatory elements to control the level of gene expression. This
latter type of element is the most recently discovered and as such is
less well characterized. An interesting distinction can be made in that
some negative regulatory elements can act in either orientation and
with some distance independence, and as such have been called either
dehancers or silencers (Baniahmad et al., 1987; Laimins et al., 1986;
Remmers et al., 1986).

Negative regulation of enhancer elements is best typified by
the IgH enhancer, in which Wasylyk and Wasylyk (1986) have shown that
sequences on either side of the central core sequence down regulate
expression in fibroblasts as compared to B-cells. As mentioned above,
there must be a mechanism by which the appropriate genes are expressed
at the right times in the right tissues. This may occur through the
regulation of many protein factors, but more likely there is one
protein that regulates the organization of the other transcriptional
factors. It seems apparent that the complexity of the eukaryotic
promoter would in many cases permit great specificity of expression
but could be a regulatory nightmare for the cell. An exquisite example
of coordinate regulation of many genes is found in the adenovirus
system and the E1a protein. E1a, one of the immediate early proteins
produced in early infection, coordinates the expression of several
other genes (Yee et al., 1987) and also represses the activity of
other elements, such as the SV40 enhancer.

A particularly interesting example of negative regulation, which
relates to E1a regulation, has been described for embryonal carcinoma
(EC) cells. SV40, polyoma virus, and Moloney murine leukemia virus are
unable to express their genomes when transfected into undifferentiated
EC cells. The induction of differentiation removes the block on the
expression of both viral and cellular genes (Gorman et al., 1985).
Mutants of polyoma virus were isolated that could replicate in
undifferentiated EC cells, and it was found that the mutations occurred
predominantly in the promoter and enhancer regions of the early genes.
In contrast, it has been found that the adenoviruses replicate well
in undifferentiated EC cells. In conjunction, it was discovered that
mutants in the E1a region could grow in undifferentiated, but not
differentiated, EC cells. Taken together with previous evidence about
the function of the E1a protein, it has been suggested that EC cells
contain an E1a-like protein that negatively regulates gene expression
until differentiation is induced (Gorman et al., 1985). Gorman et al.
(1985) have demonstrated that when the SV40 early promoter is
introduced by infection it is inactive in EC cells, but when introduced
by CaPO4 transfection it is expressed in an enhancer-independent
fashion. This result strongly suggests that the large number of
template molecules present in the transiently transfected cell is able
to titrate out the negative factor (or factors) and thus allow
expression from some of the genomes present. Gorman et al. (1985) have
also shown that the negative factors in EC cells have different
relative affinities for the various enhancers, and surprisingly the
affinity of the interaction did not necessarily relate to the level of
expression.

A number of cellular genes have been shown to contain negative
regulatory elements, although their specific mode of action has not been
characterized. These genes include mouse β-interferon (Goodbourn et
al., 1986), mouse c-myc (Remmers et al., 1986), rat insulin 1
(Laimins et al., 1986), chicken lysozyme (Baniahmad et al., 1987),
mouse p53 tumor antigen (Bienz-Tadmoor et al., 1985), chicken
ovalbumin (Gaub et al., 1987), and rat α-fetoprotein (Muglia and
Rothman-Denes, 1986). This list includes genes in which the negative
element is situated within an enhancer (mouse β-interferon) and those
in which it is interspersed between other promoter elements (chicken
lysozyme and rat α-fetoprotein). The best characterized of these
are the chicken lysozyme and mouse β-interferon genes, in which the
sequences responsible for the negative effect have been identified
(Goodbourn et al., 1986; Baniahmad et al., 1987). The chicken lysozyme
gene is of particular interest because it contains several possible
negative regulatory sequences located at -0.25, -1.0 and -2.4 kb from
the start of transcription, and they are well separated from the
enhancer element identified 7 kb upstream (Theisen et al., 1986).
Additionally, it is interesting that both the chicken lysozyme and the
rat insulin 1 gene negative regulatory elements are contained within
repetitive elements. The chicken lysozyme element is found within the
CR1 repeat, which is a middle repetitive sequence and has limited
homology to the mammalian Alu-type sequences. The CR1
repeats near the chicken ovalbumin gene are found in areas where there
is a change in DNase I sensitivity when the ovalbumin gene is
induced, perhaps indicative of a protein/DNA interaction (Stumph et
al., 1984). The rat insulin 1 element is a member of the family of long
interspersed rat repetitive sequences (LINEs) that are present in
about 50,000 copies per cell (Laimins et al., 1986). The fact that
some of the negative regulatory elements identified so far are
associated with middle repetitive sequences has attracted attention.
Some investigators have proposed that the function of this arrangement
may be to coordinate transcriptional domains. The isolation of a domain
by blocking it off with repetitive elements would be consistent with
the structure of eukaryotic chromatin as we understand it today, and
would allow for coordinate control of a gene or set of genes of related
function (Laimins et al., 1986). Factors that interact with negative
regulatory elements have yet to be identified, and the protein/DNA and
protein/protein interactions that result in the negative regulation of
transcription remain to be characterized.

Histone  Genes 
Histone proteins have been known for a considerable time and their
composition has been the subject of much investigation (reviewed in
Isenberg, 1979). Little was known, however, about the genes encoding
these basic proteins until the late 1960s and the 1970s, when many
investigators took advantage of the small size of the histone messages
and their relative abundance to investigate the regulation of this set
of genes. The histone genes have many characteristics that make them an
attractive model system for the investigation of regulation. They are
coordinately expressed during S-phase of the cell cycle, and this
expression is the result of both transcriptional and
posttranscriptional processes. Additionally, their small size and basic
structure (no introns, minimal processing) make them an easy system to
manipulate and study (Maxson et al., 1983). If we can understand how
the highly coupled expression of the histone genes is controlled,
perhaps we can then understand how other genes are expressed,
coordinately or otherwise.

Historical background. One of the initial observations regarding
histone proteins was that they are present in a relatively invariant
1:1 molar ratio with DNA in the cell (Prescott, 1966). It was further
demonstrated that the amount of histone protein present in a cell
doubled during S-phase of the cell cycle (Bloch et al., 1967). Such
results suggested a possible coupling between histone synthesis and
DNA synthesis. Borun et al. (1967) were able to demonstrate that a
class of polyribosomal RNAs (7-9S) was selectively enriched during S
phase of the HeLa cell cycle and that they coded for histone-like
polypeptides in vitro, thus giving more credence to the relationship
that had been demonstrated earlier. Borun et al. also noted several
properties of these small mRNAs that have become the foundation of
present day theory about histone mRNA regulation: 1) the addition of
cytosine arabinoside caused a fourfold increase in the "histone" mRNA
destabilization rate as compared to actinomycin D treated cells; 2) the
newly synthesized 7-9S RNA, at the G1-S boundary, became associated
with polyribosomes, thus beginning histone synthesis; and 3) two hours
before the end of DNA synthesis in synchronized HeLa cells, 7-9S mRNA
transcription ceased and the remaining 7-9S mRNA decayed with a
half-life of approximately one hour. Borun et al. proposed, somewhat
incorrectly, that the control of histone mRNA levels was through
transcriptional regulation. The refinement of molecular techniques has
allowed later investigators to define the degree to which
transcriptional and posttranscriptional mechanisms regulate histone
mRNA metabolism. Butler and Mueller (1973) repeated and extended the
results of Borun et al. by demonstrating several basic facts. First,
cycloheximide was able to stabilize histone mRNA in the presence of
hydroxyurea, a potent inhibitor of DNA synthesis. When added to
synchronized HeLa cells, hydroxyurea causes a very rapid
destabilization of almost all histone mRNAs (90%) via the complete
shutdown of DNA synthesis (Baumbach et al., 1984; Heintz et al., 1983;
Sittman et al., 1983). The stabilization by cycloheximide suggests that
one or more proteins are necessary for the destabilization process to
occur. The 10% of histone message that remains is insensitive to
hydroxyurea and probably represents replication independent histone
gene mRNAs (Wells and Kedes, 1985; Wu and Bonner, 1982). Second,
transcription is not necessary for the production of this putative
destabilization factor, as the addition of a transcription inhibitor
has no effect on the subsequent destabilization of histone mRNA. Third,
Butler and Mueller (1973) demonstrated a transient increase in the pool
of free histone proteins for 20 minutes after treatment with
hydroxyurea. They suggested in their regulatory model that the free
histone proteins might autogenously regulate the translation of their
own message and/or the stability of the remaining message following the
cessation of DNA synthesis. Nearly 15 years later, the idea of
autogenous regulation has gained popularity, since Ross and coworkers
(1986, 1987) have demonstrated the specific degradation of histone mRNA
in vitro and isolated a nuclease activity that degrades poly(A)-minus
messages from the 3' end.

The histone gene enriched environment of the sea urchin genome allowed
for the early isolation of these genes by equilibrium centrifugation and
subsequently the characterization of the coding and spacer region base
composition (Birnstiel, 1974). The sea urchin genes have been
successfully used as probes for the isolation of histone genes from
several species, including vertebrates such as Xenopus (Moorman et
al., 1980) and mouse (Seiler-Tuyns and Birnstiel, 1981). The higher
vertebrate histone genes were then used to expedite the isolation of
the human histone genes (Clark et al., 1981; Heintz et al., 1981;
Sierra et al., 1982). The replication dependent histone genes, which
comprise the majority of expressed histone genes, are characterized by
a lack of introns and an extremely well conserved 3' end sequence that
consists of a 15 bp stem and loop structure.

Human  histone  gene  organization.  The  isolation  of  the  human  histone 
genes,  which  had  previously  been  so  intensively  studied,  permitted  the 
proposed  regulatory  hypotheses  to  be  tested.   The  organizational 
pattern  of  the  human  histone  genes  was  uncovered  by  restriction  enzyme 
analysis,  and  Southern  blot  hybridization  (Southern,  1975)  of 
restricted  phage  clones  demonstrated  that,  unlike  the  tandem  repeats  of 
the lower eukaryotes, the human genes were clustered but had no obvious
organizational pattern (Sierra et al., 1982; Heintz et al., 1981;
Clark et al., 1981). Sierra et al. (1982) were able to isolate lambda
Charon 4A phage clones representative of three families or clusters.
Unlike the lower eukaryotic organization, none of these clustered
groups of human histone genes contained a human H1 gene. By using a
chicken H1 specific probe, Carozzi et al. (1984) isolated a clone that
had all 5 human histone genes, including an H1 histone gene. Recently,
several human histone genes have been localized to different
chromosomes (Triputti et al., 1986; Green et al., 1986). This
suggests that coordinate control of human histone gene expression might
not be as easily achieved as in lower eukaryotes.

Another question that had not been addressed up to this time was
whether different histone mRNAs were the products of different histone
genes. Lichtler et al. (1982) demonstrated convincingly that seven
species of human H4 histone mRNA were encoded by at least 3 separate
genes, thereby establishing that the human histone genes are a
repetitive, but not redundant, family of genes. Lichtler et al. (1982)
also strengthened the possibility that different histone genes might be
subject to diverse regulation, since it was obvious that certain H4
mRNAs were present at higher levels than others.

Transcriptional and Posttranscriptional regulation. Our knowledge
about these two steps in the regulation of histone mRNA metabolism has
been strengthened by the studies of Heintz et al. (1983), Sittman et
al. (1983) and Plumb et al. (1983a,b). Plumb et al. (1983b) utilized
HeLa cells synchronized by double thymidine block and hybrid selection
of pulse labelled histone mRNA. This technique permitted several
species of histone mRNA to be resolved on acrylamide gels. These
experiments demonstrated that the histone genes are transcribed in the
early part of S-phase, approximately 2-3 hours post release from double
thymidine block. The increase in histone mRNA transcription was 3-5
fold during this period. Baumbach et al. (1987) demonstrated a
similar increase in the level of histone gene transcription at the
beginning of S-phase with nuclear run-on analysis. However, one of the
anomalies of histone gene expression is that if one follows the total
increase in the amount of histone mRNA, the actual elevation is
10-25 fold (Plumb et al., 1983b; Heintz et al., 1983). The reported
differences in histone mRNA levels have varied from one report to
another, and this is probably the result of the various synchronization
and analysis techniques utilized. Conservatively, the level of
transcription increases 3 fold during the first 2-4 hours of S phase,
and the stability of histone mRNA rises 10-20 times during S-phase.
Outside of S phase or after the artificial cessation of DNA synthesis
by drug treatment, the half-life of histone mRNA is approximately 10-15
min (Sittman et al., 1983; Plumb et al., 1983a).
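
The way in which a modest change in transcription and a change in
message stability combine to set the mRNA level can be illustrated with
a simple kinetic sketch. It assumes, purely for illustration, constant
synthesis and first-order decay, so that the steady-state level equals
the synthesis rate divided by the decay constant (ln 2 / half-life).
The function names are hypothetical. The 15 min half-life is taken from
the range quoted above, and the 60 min S-phase half-life is an
illustrative assumption rather than a measured value.

```python
import math


def decay_constant(half_life_min):
    """First-order decay constant (per minute) for a given half-life."""
    return math.log(2) / half_life_min


def steady_state_level(synthesis_rate, half_life_min):
    """Steady-state mRNA level under constant synthesis, first-order decay."""
    return synthesis_rate / decay_constant(half_life_min)


def fold_change(synthesis_fold, half_life_before, half_life_after):
    """Fold change in steady-state level when both parameters change."""
    return synthesis_fold * (half_life_after / half_life_before)


def fraction_remaining(minutes, half_life_min):
    """Fraction of message left after synthesis stops."""
    return 0.5 ** (minutes / half_life_min)


# A 3-fold rise in transcription combined with an assumed 4-fold rise in
# half-life (15 min to 60 min) multiplies the steady-state level 12-fold.
print(fold_change(3, 15, 60))
# With a 15 min half-life, one quarter of the message remains after 30 min.
print(fraction_remaining(30, 15))
```

Under this model the two effects are multiplicative, which is why an
increase in accumulated message can greatly exceed the increase in
transcription rate. Cells in S-phase need not be at steady state, so the
product should be read as an upper bound on what the model predicts for
a finite labelling period.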

Nuclease  sensitivity  and  Protein/DNA  interaction.  Historically,  a 
hallmark  of  an  active  gene  has  been  the  presence  of  nuclease 
hypersensitive  sites  in  the  promoter  region  of  the  gene.  Chrysogelos  et 
al.  (1985)  and  Moreno  et  al .  (1985)  have  extensively  characterized  the 
nuclease  sensitivities  of  the  flanking  and  coding  regions  of  the  F0108 
human  H4  histone  gene.  Together,  their  results  demonstrate  that  the  5' 
region  of  the  F0108  H4  gene  is  a  dynamic  area  of  varying  sensitivity  to 
DNase  I,  micrococcal  and  SI  nuclease.  Since  the  histone  genes  are  cell 
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cycle  regulated  with  respect  to  transcription  and  total  message  levels, 
Chrysogelos  et  al .  (1985)  were  able  to  correlate  the  size  of  the  DNase 

I  hypersensitive  site  with  the  stage  of  the  cell  cycle.  As  mentioned 
earlier,  the  appearance  of  a  DNasel  hypersensitive  site  is  indicative 
of  protein/DNA  interactions  in  the  region.  Paul!  et  al .  (1987) 
utilized  the  technique  of  genomic  sequencing  to  visualize  the  in  vivo 
protein/DNA  interactions  in  the  promoter  of  the  F0108  human  H4  histone 
gene.  They  demonstrated  that  there  are  two  binding  sites  in  the 
proximal  promoter  region  which  have  been  designated  Site  I  (-122  bp  to 
-89  bp)  and  Site  II  (-64  bp  to  -23  bp) .  Site  I  contains  a  putative  Spl 
site  and  a  possible  CAAT  box.  Site  II  contains  the  GGTCC  element  (see 
below)  and  the  TATAA  box.  The  protein/DNA  complexes  at  Site  I  and  Site 

II  are  present  throughout  the  cell  cycle  and  presumably  these 
interactions  in  the  promoter  region  are  involved  in  the  basal  and 
increased  level  of  transcription  demonstrated  at  the  onset  of  S -phase. 
Perhaps  the  interactions  that  regulate  the  level  of  transcription  at 
the  start  of  S-phase  occur  through  protein/protein  interactions  since 
there  is  no  apparent  change  in  the  protein/DNA  interactions  during  the 
cell  cycle.  In  studies  done  by  Heintz  and  Roeder  (1984),  it  was 
demonstrated  that  the  pHuH4  histone  gene  was  transcribed  in  vitro  to  a 
greater  extent  in  S-phase  extracts  than  in  G-phase  extracts.  It  would 
be  important  to  know  whether  there  is  a  new  protein  that  appears  at 
the  onset  of  S-phase  that  acts  either  directly  to  augment 
transcription  by  interacting  with  the  DNA  or  through  a  protein/protein 
interaction. Since the identification of protein/DNA interactions in
the promoter of the F0108 H4 gene, it has been of great interest to
us to ascertain whether these interactions are functional, and
this is addressed to some extent in this work.

Other  histone  genes,  from  a  variety  of  species,  have  been 
characterized  with  respect  to  the  contribution  of  5'  flanking  sequences 
in  transcriptional  regulation.  Notably,  the  human  H2B  gene  has  been 
extensively  characterized  with  in  vitro  transcription  by  Sive  et  al. 
(1986).  They  demonstrated  that  the  transcription  of  the  H2B  gene  is 
dependent  on  a  number  of  sequences  5'  to  the  TATA  box  including  the  H2B 
octamer element and CCAAT box. Recently, emphasis has been placed
on  identification  of  the  sequences  responsible  for  the  periodic 
increase  in  histone  gene  transcription  during  the  cell  cycle. 
Artishevsky et al. (1987) have demonstrated, although not convincingly,
that the sequences responsible for the S-phase increase in
transcription of a hamster H3 gene are located in the proximal
promoter region (-150 bp); however, they were not explicitly defined.
The authors propose that this region of the hamster H3 gene bears
similarity to the sequence, 5'-GCGAAA-3', that has been shown to
regulate  the  cell  cycle  expression  of  the  HO  genes  of  yeast  (Nasmyth, 
1985).  Taken  as  a  whole,  these  many  results  support  the  idea  that  the 
histone  genes  are  controlled  at  the  transcriptional  level  by  promoters 
that  are  composed  of  many  elements  that  interact  with  different  and 
specific  proteins.  Though  not  dealt  with  here,  van  Wijnen  et  al. 
(1987,  1988)  have  shown  that  the  promoter  region  of  several  cloned 
human histone genes can interact with nuclear proteins in a specific
manner.

Sequence analysis. Only a few histone genes have been sequenced
extensively  enough  to  permit  a  comparative  analysis  of  5'  flanking 
sequences.   The  majority  of  sequencing  information  concerning  histone 
genes  has  revolved  around  the  coding  sequences.  Comparative  analysis  of 
these  protein  sequences  has  revealed  remarkable  homogeneity  from 
species  to  species,  especially  with  respect  to  histones  H3  and  H4 
(Wells,  1986).  Unfortunately  little  5'  flanking  sequence  for  H4  histone 
genes  has  been  published,  and  most  sequences  extend  only  80-120 
nucleotides  upstream  (Wells,  1986).   A  comparison  of  the  F0108A  H4 
histone gene (Sierra et al., 1983), the subject of my studies,
and the human H4 histone gene independently isolated by Heintz et al.
(1981), suggests that some of the sequences in the 5' proximal promoter
region are conserved, namely the TATA and GGTCC boxes. The TATA box is, of
course,  a  canonical  RNA  polymerase  II  transcription  sequence  and  the 
GGTCC  box  has  been  associated  with  many  H4  gene  promoters  from  sea 
urchin to human (Hentschel and Birnstiel, 1981; Wells, 1986).
Comparison of the F0108 gene to the mouse H4 gene isolated by
Seiler-Tuyns and Birnstiel (1981) reveals extensive similarity between the
promoters,  especially  the  TATA  box,  GGTCC  element,  and  the  CAAT 
sequence  that  is  found  as  either  a  single  or  double  copy  located  just 
5'  to  the  GGTCC  element  in  many  H4  histone  genes  (Wells,  1986).   The 
significance  of  the  H4  "CAAT"  sequence  is  somewhat  questionable  as  it 
was originally thought to represent the "CCAAT" box that is
associated with many RNA polymerase II promoters. Several CCAAT box
factors have been isolated, and all of them require, for good
binding, the sequence 5'-CCAAT-3' (Dorn et al., 1987). The H4
histone gene with which we are working, F0108, does have two CCAAT
boxes located several hundred base pairs upstream and the possible
functionality  of  both  the  proximal  CAAT  boxes  and  the  distal  CCAAT 
boxes  is  discussed  in  the  work  presented  here. 

The  functionality  of  these  and  other  sequences  in  the  promoter  of 
histone  genes  has  been  one  of  the  focuses  of  our  work.   Also,  the 
Heintz and Roeder laboratory has investigated the functionality of
promoter  sequences  in  the  human  H4  gene  they  isolated.   In  vitro 
transcription  analysis  of  Bal  31  deletion  mutants  of  the  F0108  H4  gene 
by  Sierra  et  al.  (1983)  demonstrated,  in  whole  cell  extracts,  that 
promoter  sequences  could  be  deleted  to  within  50  bp  of  the  cap  site 
without  loss  of  transcription.   These  sequences  include  only  the  TATA 
box  and  GGTCC  element,  but  are  apparently  sufficient  for  accurate  in 
vitro  transcription  to  occur.   In  vitro  transcription  analysis  by  Hanly 
et  al.  (1985)  demonstrated  very  similar  effects.   When  only  the  TATA 
box  remained  as  the  sole  RNA  polymerase  II  consensus  element, 
transcription  was  accurate  but  at  a  reduced  level.   Hanly  et  al. 
(1985)  have  suggested  that  the  sequences  extending  to  -110  bp  are 
sufficient  for  maximal  transcription  of  the  human  H4  histone  gene  in 
vitro. 

The  analysis  of  histone  gene  transcription  in  vitro  has  contributed 
to  our  understanding  of  the  minimal  requirements  for  5'  sequence; 
however,  it  has  been  demonstrated  previously  that  the  requirements  for 
initiation  of  mRNA  synthesis  in  vitro  and  in  vivo  are  different  in  many 
instances. One might reasonably assume that the chromatin structure of
an integrated gene would affect its regulation and intrinsic
accessibility to regulatory proteins. We felt it was necessary to
extend these in vitro studies into stable cell lines for the reasons
outlined above and discussed in Materials and Methods (Chapter 2). A
logical extension of many in vitro studies has been to manipulate the
promoter or coding region of a gene in vitro, reintroduce it in vivo,
and measure the effect of the manipulation on expression.
Perhaps this has been most successfully accomplished in yeast, where
the reintroduction of the manipulated gene can be done with precision
into the exact locus from which it came originally (Szostak et al.,
1983). This is a goal shared by many molecular biologists as it would
be  a  more  accurate  way  to  assess  structure/function  relationships. 

Histone  genes  have  been  transiently  expressed  in  a  number  of 
different  cell  types  (Kroeger  et  al.,  1987;  Capasso  and  Heintz,  1985; 
Green et al., 1986; Bendig and Hentschel, 1983; Marashi et al., 1986).
The  transient  assay  affords  a  reasonably  quick  way  to  examine  the 
effects  of  DNA  manipulation.  The  results  have  suggested  that 
heterologous  or  homologous  systems  can  be  used  to  express  transfected 
genes.  In  probably  one  of  the  more  radical  transfection  experiments, 
Bendig  and  Hentschel  (1983)  introduced  the  embryonic  histone  gene 
repeat  of  the  sea  urchin  Psammechinus  miliaris  transiently  into  HeLa 
cells.  Correct  5'  mRNA  start  sites  were  detected  for  all  5  genes  of  the 
cluster,  but  the  termination  of  transcription  was  generally  aberrant 
with the exception of the H2B gene. These results suggest
that heterologous systems may share many regulatory components that
allow them to transcribe foreign genes correctly, but may have parts of
the regulatory machinery (in this case 3' processing) that are
incompatible. This particular subject is discussed in the work
presented  here.  At  the  point  where  our  work  began,  the  only  stable  cell 
lines  created  with  an  integrated  human  H4  histone  gene  were  by  Capasso 
and Heintz (1985). They utilized one construct, pHuH4, to assess the
level of H4 histone gene regulation in mouse Ltk- cells. In vivo S1
nuclease analysis of this single construct permitted them to conclude
that mouse cells could accurately transcribe the human H4 gene. Green
et al. (1986) demonstrated that the F0108 human H4 histone gene was
expressed in mouse C127 lung fibroblasts. In these experiments the
F0108 gene was carried episomally on a construct made from the 69%
transforming fragment of bovine papilloma virus.

With  this  understanding  and  background  we  initiated  studies  with 
the  human  H4  histone  gene  F0108  (Sierra  et  al.,  1982)  to  ascertain  the 
in  vivo  functionality  of  sequences  in  the  5'  promoter  region. 


CHAPTER  2 
MATERIALS  AND  METHODS 

Experimental rationale and commentary. Of particular importance,
for  histone  and  other  eukaryotic  genes,  is  the  identification  of 
regulatory  sequences  and  molecules  that  mediate  transcriptional 
control.  Several  laboratories,  including  our  own,  have  conducted  in 
vitro  and  in  vivo  experiments  to  assess  the  functionality  of  the 
histone  gene  coding  region  and  flanking  sequences  in  the  regulation  of 
expression (van Wijnen et al., 1987; Sierra et al., 1983; Heintz et
al., 1983; Pauli et al., 1987; Dailey et al., 1986; Green et al.,
1986).

We  felt  that  an  in  vivo  approach,  via  the  introduction  of  modified 
genes  by  transfection,  had  the  advantage  that  the  integrated  gene  was 
packaged as chromatin and presumably transcription factors, such as
RNA polymerase II, CTF, and Sp1, were present in proper and localized
concentrations  due  to  the  structural  integrity  of  the  nucleus. 
Therefore  the  results  would  be  a  more  accurate  reflection  of  the  actual 
in  vivo  situation.  The  results  were  still  cautiously  interpreted  in  the 
context  of  the  experimental  parameters  present,  such  as  copy  number. 
Some  of  our  experiments  have  been  done  in  a  transient  assay  system  and 
the  expression  of  the  human  H4  gene  under  these  conditions  was  somewhat 
different from that observed when stably integrated. Presumably there were
differences in chromatin structure and factor to DNA ratios and this
may have been reflected in the results. Previous work has demonstrated
that the human H4 histone gene, with which we have worked, has a
defined chromatin structure that includes an extensive DNase I
hypersensitive  site,  and  that  this  site  fluctuates  in  size  during  the 
cell  cycle,  which  may  be  the  result  of  the  interaction  of 
transcriptional  control  factors  (Chrysogelos  et  al.,  1985). 

An  in  vivo  experiment  with  a  transfected  gene  requires  an  assay  and 
experimental  approach  that  will  allow  for  the  detection  of  the 
introduced gene. Several options were available for us to pursue. The
two most commonly used approaches have been 1) linking the promoter of a
gene to a reporter gene such as chloramphenicol acetyltransferase (CAT)
(Gorman et al., 1982), or 2) introducing the whole gene, coding and
flanking regions, into a heterologous environment (e.g. a human gene into a
mouse cell) (Capasso and Heintz, 1985; Marashi et al., 1986). Several
groups, including our own, have utilized such heterologous systems
because they allow for the easy detection, by S1 nuclease analysis, of
the  mRNA  of  interest  with  little  or  no  background.  We  decided  that  it 
would  be  better  to  leave  the  H4  promoter  attached  to  the  H4  gene  and 
express  these  constructs  in  mouse  cells. 

The  histone  constructs  we  cotransfected  with  the  pSV2neo  plasmid 
were expressed and detectable with S1 nuclease analysis in mouse cells.
We  realized  that  the  histone  promoter  deletion  constructs  could  be 
compared  to  one  another  and  the  differences  in  the  steady  state  level 
of  histone  mRNA  from  one  construct  to  another  were  a  direct  reflection 
of transcription. We concluded this because the coding region of all
the constructs had remained intact. Messenger RNA turnover was
presumably the same for each construct and any differences in the
steady-state level of histone mRNA were therefore a result of
transcription. 
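
The reasoning above can be illustrated with a simple first-order turnover model. This is only a sketch under stated assumptions, not a description of the analysis performed in this work: mRNA is assumed to be synthesized at a constant rate and degraded with a single rate constant, so the steady-state level equals synthesis rate divided by decay constant. If two constructs share the same decay constant, the ratio of their steady-state levels equals the ratio of their synthesis rates. The function names and all numerical values are hypothetical.

```python
def steady_state_level(synthesis_rate, decay_constant):
    """Steady-state mRNA level for dM/dt = k_s - k_d * M (assumed model)."""
    if decay_constant <= 0:
        raise ValueError("decay constant must be positive")
    return synthesis_rate / decay_constant


def relative_transcription(level_a, level_b):
    """Ratio of synthesis rates of constructs a and b.

    Valid only if both mRNAs have the same decay constant, which is the
    assumption made in the text because the coding region is unchanged.
    """
    return level_a / level_b


# Hypothetical values, in arbitrary units
shared_decay = 0.5
full_promoter = steady_state_level(10.0, shared_decay)
deletion = steady_state_level(2.5, shared_decay)
ratio = relative_transcription(deletion, full_promoter)
print(ratio)  # 0.25, the same as the ratio of synthesis rates 2.5 / 10.0
```

If turnover differed between constructs, the steady-state ratio would no longer report on transcription alone, which is why the intact coding region matters for the comparison.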

We included a mouse H4 control in each of our S1 nuclease assays to
permit  the  quantitation  of  the  total  amount  of  mRNA  and  particularly 
the  amount  of  histone  mRNA.  In  retrospect,  this  has  helped  us  to 
understand  more  about  the  interaction  of  transcription  factors  with  the 
H4  histone  genes  and  in  some  cases  has  been  an  adequate  internal 
control.  Because  of  the  competition  phenomenon  we  uncovered  (described 
in  Chapter  4)  the  mouse  H4  became  a  less  than  perfect  internal  control. 
Originally  we  tried  to  incorporate  the  mouse  18S  ribosomal  RNA  gene 
into our S1 nuclease assay but were unable to find adequate
hybridization conditions for both histone and ribosomal probes. Ideally
another mouse histone gene in conjunction with the mouse H4 should have
been used.
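
As an illustration of how an internal control of this kind is typically used, the sketch below divides the human H4 signal in each lane by the mouse H4 signal from the same lane and then expresses every lane relative to a chosen reference construct. The signal values, construct names, and function names are hypothetical, and the calculation assumes that the mouse H4 signal is unaffected by the transfected gene, an assumption that the competition phenomenon mentioned above calls into question.

```python
def normalize_to_control(human_signal, mouse_signal):
    """Human H4 signal divided by the mouse H4 signal from the same lane."""
    if mouse_signal <= 0:
        raise ValueError("control signal must be positive")
    return human_signal / mouse_signal


def relative_expression(lanes, reference):
    """Normalized expression of every lane relative to the reference lane.

    lanes maps a construct name to (human_signal, mouse_signal).
    """
    ref_value = normalize_to_control(*lanes[reference])
    return {
        name: normalize_to_control(human, mouse) / ref_value
        for name, (human, mouse) in lanes.items()
    }


# Hypothetical densitometry readings in arbitrary units
lanes = {
    "construct_A": (1200.0, 800.0),
    "construct_B": (300.0, 600.0),
}
result = relative_expression(lanes, reference="construct_A")
print(result)
```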

Materials and general laboratory procedures. All chemicals were of
the highest quality available. Phenol was redistilled and stored frozen
at -20°C with the addition of 0.1% (w/v) 8-hydroxyquinoline. The
frozen  phenol  was  equilibrated  first  with  100  mM  Tris-HCl  (pH  8.0)  and 
subsequently  with  10  mM  Tris-HCl  and  1  mM  EDTA   (pH  8.0)  until  the  pH 
was between 6.0 and 7.0. Phenol/chloroform extraction refers to the
addition of one volume of equilibrated phenol and one volume of
chloroform/isoamyl alcohol (24:1) to a solution, mixing, and
separation of the phases by a brief centrifugation step. Next, at least
one volume of chloroform/isoamyl alcohol is added and the above
centrifugation step repeated. Hereafter precipitation refers to the
addition of 2-3 volumes of 95% ethanol and 1/10th volume of 3 M sodium
acetate (pH 5.0) to a solution of DNA or RNA. This was subsequently placed at
-20 or -70°C for a sufficient time to allow precipitation of the
nucleic acids. Radioactively labelled nucleotides, [γ-32P]ATP (~600
Ci/mmol) and [α-32P]dCTP (~3000 Ci/mmol), were purchased from Amersham
and ICN. X-ray film, Cronex and XAR-5, was obtained from Dupont and
Eastman  Kodak  respectively.  For  all  experiments  that  involved  RNA  the 
solutions  were  pretreated  with  0.01%  diethylpyrocarbonate  (DEPC)  and 
glassware  was  treated  with  0.1%  DEPC.  After  a  30  min.  treatment  the 
solutions  and  glassware  were  autoclaved  for  thirty  minutes  to  remove 
any traces of DEPC.

Plasmid growth and preparation. L-broth (Maniatis et al., 1982) was
prepared by mixing 10 g/l Bacto tryptone (Difco), 5 g/l yeast extract
(Difco), 5 g/l NaCl, and 2 ml/l 1 M NaOH in 1 L of ddH2O (double
distilled water). The medium was then autoclaved for 30 min. in order
to sterilize it. Ten milliliter starter cultures of bacteria were
prepared in sterile conical tubes and grown overnight at 37°C. These
were supplemented with sterile 20% glucose (100 µl), 1 M MgSO4 (10 µl),
and 50 µg/ml ampicillin (Sigma). Small inocula were removed from
glycerol stocks or colonies were picked from plates and placed in the
starter culture overnight. Large scale (500 ml) preparations were then
completed with 5 ml 20% glucose, 0.5 ml MgSO4 and 50 µg/ml ampicillin.
Cultures were grown at 37°C until they reached an optical density
(595 nm) of 0.4 to 0.5. At this point 4.25 ml of 20 mg/ml
chloramphenicol were added and the cultures were allowed to grow for an
additional 16-18 hrs. If the bacteria contained a pUC plasmid or
derivative,  the  amplification  step  was  omitted.  The  cells  were 
harvested  and  the  plasmid  DNA  was  prepared  essentially  as  described  by 
Maniatis  et  al.  (1982).  The  pellet  was  resuspended  in  10  ml  of 
Solution  1  (50  mM  glucose,  25  mM  Tris-HCl  pH  8.0,  10  mM  EDTA,  and  5 
mg/ml lysozyme (Cooper Biomedical)) and incubated at room temperature
for  5  min.  Next,  20  ml  of  Solution  2  (0.2  N  NaOH,  1%  SDS)  was  added  and 
the  cells  were  placed  on  ice  for  10  min.  Fifteen  ml  of  Solution  3  (5M 
KAc,  pH  4.8)  was  added  and  incubated  on  ice  for  10  min.  The  cells  were 
then  centrifuged  at  10k  rpm  for  20  min.,  4°C.  The  supernatants  from  all 
tubes  were  pooled  and  precipitated  with  0.6  volume  of  isopropanol  for 
15  min.  at  room  temperature.  The  precipitate  was  recovered  by 
centrifugation  at  10k  rpm  for  30  min.  The  pellet  was  dried  and 
resuspended in 8 ml of 10 mM Tris-HCl pH 8.0, 1 mM EDTA (TE). Eight
grams of CsCl and 640 µl of 10 mg/ml ethidium bromide were added and
the  preparation  was  centrifuged  for  36  hrs  at  45k  rpm  in  Beckman  heat 
sealed  tubes  in  a  Beckman  Ti50  rotor.  The  DNA  band  was  visualized  by 
ultraviolet  illumination  and  recovered  by  side  puncture  with  a  20 
gauge  hypodermic  needle.  The  DNA  was  then  either  placed  over  a  small 
Dowex AG 50W-X8 column or butanol extracted 5X to remove the ethidium
bromide.  The  sample  was  then  dialyzed  extensively  against  TE.  The  DNA 
was  recovered  by  ethanol  precipitation  and  subsequent  centrifugation. 
Quantitation  of  the  yield  was  done  spectrophotometrically  (Beckman)  at 
260  nm. 
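
The spectrophotometric quantitation can be sketched as follows, using the standard conversion that an absorbance of 1.0 at 260 nm (1 cm path length) corresponds to approximately 50 µg/ml of double-stranded DNA. The absorbance reading, dilution factor, and volume below are hypothetical examples, and the function names are illustrative only.

```python
# Standard approximation: 1 A260 unit of double-stranded DNA is ~50 ug/ml
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dsdna_concentration_ug_per_ml(a260, dilution_factor=1.0):
    """Concentration of the undiluted sample from an A260 reading."""
    if a260 < 0 or dilution_factor <= 0:
        raise ValueError("absorbance must be >= 0 and dilution factor > 0")
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def total_yield_mg(a260, dilution_factor, sample_volume_ml):
    """Total plasmid yield in mg for a sample of the given volume."""
    conc = dsdna_concentration_ug_per_ml(a260, dilution_factor)
    return conc * sample_volume_ml / 1000.0


# Hypothetical reading: A260 of 0.2 on a 1:100 dilution of a 1 ml sample
concentration = dsdna_concentration_ug_per_ml(0.2, dilution_factor=100)
yield_mg = total_yield_mg(0.2, dilution_factor=100, sample_volume_ml=1.0)
print(round(concentration, 1), round(yield_mg, 3))
```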

Plasmid  preparation  with  TB.  The  method  is  similar  to  that  outlined 
above for L-broth except that the TB medium was used. TB was prepared
as described by Tartof and Hobbs (1987). Bacto tryptone (6.65 g),
13.3 g of yeast extract, and 2.2 ml of glycerol were prepared in 450
ml of ddH2O. The medium was sterilized in the autoclave for 30 min. To
the sterile solution was added 55.5 ml of sterile 0.17 M KH2PO4, 0.72 M
K2HPO4. This medium was inoculated and bacteria were grown as above.
Because  the  medium  is  very  rich,  the  yields  were  often  large  so 
bacteria  that  contained  pBR322  plasmids  were  not  induced  with 
chloramphenicol.  The  DNA  was  prepared  by  the  same  method  except  that 
the  original  volume  of  cells  was  split  into  two  aliquots  at  the 
beginning  of  the  isolation  procedure.  This  was  found  to  be  essential 
and  greatly  facilitated  lysis  and  subsequent  isolation  of  the  plasmid 
DNA.   For  comparative  purposes,  500  ml  of  TB  can  produce  4-5  mg  of 
total  plasmid  DNA  in  comparison  to  1  mg  with  L-Broth  with 
amplification. 

Production  of  unidirectional  deletions  with  Exonuclease  III.   This 
method  was  carried  out  essentially  as  described  by  Stratagene  (San 
Diego,  CA)  from  which  the  reagents  were  purchased.  The  method  takes 
advantage of the fact that Exonuclease III cannot digest 3' single
strand overhangs. For our purposes the pF0005 insert was cloned into
the PstI/HindIII sites of Bluescript M13+. The HindIII site is adjacent
to an ApaI site in the vector. To produce the deletions in which we
were interested, the pF0005 Bluescript clone was digested with HindIII
(5' overhang) and ApaI (3' overhang) to completion. We then mixed 3
µg of digested DNA, 25 µl of 2X Exonuclease III buffer (100 mM Tris-HCl
pH 8.0, 10 mM MgCl2, 20 µg/ml tRNA), 5 µl of freshly prepared 200 mM
2-mercaptoethanol, 30 units of Exonuclease III, and enough ddH2O to make
the final volume 50 µl. The reaction conditions were established
through a series of titration experiments to determine the extent of
deletion with time. After the addition of the enzyme (added last) 10 µl
aliquots were removed every minute for 5 min., diluted with 80 µl 1X
Mung Bean nuclease buffer (5X = 150 mM NaOAc, pH 5.0, 250 mM NaCl, 5
mM ZnCl2, 25% glycerol) and heated to 68°C for 15 min. Once the
deletion reactions had been stopped 9 units of Mung Bean nuclease in
dilution buffer (1X = 10 mM NaOAc, pH 5.0, 0.1 mM ZnOAc, 1 mM cysteine,
0.001% Triton X-100, 50% glycerol) were added and the reaction allowed
to proceed at 30°C for 30 min. The reaction was stopped by the addition
of 100 µl of phenol/chloroform and extracted. The aqueous layer was
removed and precipitated with 10 µl of 3 M NaOAc pH 7.0 and 2.5 volumes
of  95%  ethanol.  The  DNA  was  recovered  by  centrifugation,  ligated  and 
transfected  as  described  below.  This  procedure  worked  very  poorly  and 
resulted  in  very  few  positive  clones.  The  deletions  that  were  obtained 
were  characterized  by  run-off  transcription  from  the  T3  promoter  of 
each clone. The DNA was digested with NcoI and transcription reactions
carried out exactly as described by Stratagene. The transcripts were
electrophoresed on a 6% acrylamide, 8.3 M urea gel and the extent of
deletion determined by comparison to run-off transcription from the
parental  construct  pF0005BS. 
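
The comparison of run-off transcripts can be expressed as a simple subtraction, sketched below. The sketch assumes that the deleted sequence lies entirely between the T3 promoter and the NcoI site, so that each deleted nucleotide shortens the run-off transcript by one nucleotide. The transcript lengths and clone names are hypothetical and are not values measured in this work.

```python
def deletion_extent(parental_transcript_nt, clone_transcript_nt):
    """Nucleotides deleted, inferred from run-off transcript lengths."""
    if clone_transcript_nt > parental_transcript_nt:
        raise ValueError("clone transcript is longer than the parental transcript")
    return parental_transcript_nt - clone_transcript_nt


# Hypothetical transcript sizes (nt) estimated from a denaturing gel
parental = 700
clones = {"clone_1": 640, "clone_2": 555}
extents = {name: deletion_extent(parental, nt) for name, nt in clones.items()}
print(extents)  # {'clone_1': 60, 'clone_2': 145}
```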

DNA  Fragment  Elution.  After  restriction  enzyme  digestion  DNA 
fragments  were  usually  electrophoresed  in  low  percentage  agarose  gels 
(0.7 to 1.0%) with 1X TBE (10X = 500 mM Tris-HCl pH 8.3, 500 mM boric
acid, 10 mM EDTA) and visualized by long wave ultraviolet illumination
of the ethidium bromide stained band (2 µg/ml for 15 min.). The band of
interest was excised from the gel. The Fragment Eluter (IBI) was first
run for 30 min. with low salt buffer (20 mM Tris-HCl, pH 8.0; 5 mM NaCl;
and 0.2 mM EDTA) at 125 volts. The gel fragment was then placed in the
well and the V-channel filled with 100 µl of high salt buffer (3 M
NaOAc, 5% glycerol, 0.01% Bromophenol Blue). It was important that the
gel  slice  remain  in  the  same  orientation  as  it  had  been  run  previously 
to  facilitate  the  removal  of  the  band.  The  band  was  electroeluted  at 
150  V  for  15-20  min.  after  which  the  high  salt  buffer  was  carefully 
removed in 100 µl aliquots. A total of four 100 µl aliquots were removed
from each channel. Five micrograms of glycogen (Boehringer-Mannheim)
were  added  and  the  sample  was  precipitated  with  1  ml  of  95%  ethanol  at 
-70°C  for  30  min.  The  DNA  fragment  was  then  recovered  by  centrifugation 
at  10k  rpm  for  30  min.  Fragments  isolated  in  this  manner  were  found  to 
be  directly  suitable  for  ligation  reactions  or  probe  preparation. 

DNA  ligation.  The  ligation  of  DNA  fragments  was  done  with  T4  DNA 
ligase  (New  England  Biolabs)  and  essentially  as  described  by  King  and 
Blakesley  (1986).  DNA  fragments  were  digested  with  the  appropriate 
enzymes  dictated  by  the  cloning  scheme  and  fragments  and  vectors  were 
mixed in 10 µl of 1X ligation buffer (5X = 250 mM Tris-HCl pH 7.6, 50
mM MgCl2, 25% (w/v) polyethylene glycol 8000 (Eastman Kodak), 5 mM ATP,
5 mM dithiothreitol). Usually the vector (a pUC plasmid) was treated
with phosphatase prior to the reaction and therefore the vector to
insert ratio was ~3:1. Blunt end ligations were carried out with less
than 20 µg/ml of total DNA. Sticky end ligations were done at 20-40 µg/ml
and diluted after 4 hrs at room temperature. Generally 10-20 units of
ligase were added for sticky end ligations and 200-400 units for blunt
end ligations. After 4 hours the reactions were diluted 1:2 with 1X
ligase  buffer  and  an  additional  aliquot  of  ligase  added  to  the 
reaction.  The  reactions  were  then  incubated  overnight  at  14°C  (sticky 
end)  and  4°C  (blunt  end).  The  reactions  were  diluted  1:2  with  TE  and 
transformed into DH5 bacteria as described by the methods of Bethesda
Research  Laboratories,  and  Hanahan  (1983). 
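
The vector to insert ratio quoted above is a molar ratio, so the mass of insert needed depends on the relative lengths of the two fragments. The sketch below shows the arithmetic for double-stranded DNA, where moles are proportional to mass divided by length. The fragment sizes and vector mass are hypothetical, and the function name is illustrative only.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, vector_to_insert_ratio):
    """Mass of insert (ng) giving the requested vector:insert molar ratio.

    Moles of a double-stranded fragment are proportional to mass / length,
    so insert_ng / insert_bp = (vector_ng / vector_bp) / ratio.
    """
    if min(vector_ng, vector_bp, insert_bp, vector_to_insert_ratio) <= 0:
        raise ValueError("all arguments must be positive")
    return vector_ng * insert_bp / vector_bp / vector_to_insert_ratio


# Hypothetical example: 100 ng of a 2700 bp vector, 810 bp insert, 3:1 ratio
needed = insert_mass_ng(100.0, 2700, 810, 3.0)
print(round(needed, 2))  # 10.0
```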

Preparation  of  competent  bacterial  cells  for  transformation. 
Bacteria, either DH5 or HB101, were grown in 100 ml of Luria broth to
an  OD590  =  0.375.  The  cells  were  divided  between  two  sterile  50  ml 
conical  tubes  and  placed  on  ice  for  10  min.  All  subsequent  procedures 
were  carried  out  at  4°C.  The  cells  were  then  harvested  by 
centrifugation  for  5  min.  at  5k  rpm.  The  supernatant  was  removed  and 
the  cells  gently  resuspended  in  10  ml  of  CaCl2  buffer  (60  mM  CaCl2,  10 
mM  PIPES  pH  7.0,  15%  glycerol).  The  cells  were  then  centrifuged  for  5 
min.  at  5k  rpm  and  gently  resuspended  again  in  CaCl2  buffer.  They  were 
then  placed  on  ice  for  30  min.  and  centrifuged  at  2.5k  rpm  for  5  min. 
The  cells  were  resuspended  in  2  ml  each  of  CaCl2  buffer  and  dispensed 
into 200 µl aliquots and frozen at -70°C until needed.

Transformation  of  bacteria  with  plasmid  DNA.  Competent  bacterial 
cells, either DH5 or HB101, were thawed on ice and 5-10 µl of the
ligation were added and incubated with the cells for 30 min. on ice.
The DH5 cells were heat shocked at 42°C, and the HB101 cells at 37°C.
The cells were briefly placed on ice and then diluted with 900 µl of
room temperature S.O.C. (2% Bacto tryptone, 0.5% yeast extract, 10 mM
NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4). The cells were incubated
at 37°C for 1 hour and then plated on TYN (1% tryptone, 1% yeast
extract, 0.5% NaCl) medium with ampicillin. If detection of insertion
of a DNA fragment was possible (DH5 cells and pUC plasmids) then 30 µl
of 2% X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactoside) and 20 µl of
100 mM IPTG (isopropyl-β-D-thiogalactopyranoside) were included with
the  bacteria  spread  on  the  plate.  Resistant  colonies  grew  up  overnight 
and  white  colonies,  indicative  of  a  disrupted  lac  Z  gene,  were  picked 
for  further  analysis. 

Rapid  plasmid  preparation.   The  method  is  essentially  as  described 
by  Ish-Horowicz  and  Burke  (1981)  with  some  modifications.  One 
milliliter  of  saturated  overnight  culture,  grown  in  TYN  or  L-broth, 
was  centrifuged  for  20  sec.  in  an  Eppendorf  microfuge.  The  solutions 
for  preparation  of  DNA  were  the  same  as  for  the  large  scale  preparation 
described above. The cells were resuspended in 100 µl Solution 1 and
incubated for 5 min. at room temperature. Solution 2 (200 µl) was
added and incubated on ice for 5 min. Solution 3 (150 µl) was added
and  incubated  on  ice  for  5  min.  The  cells  were  then  centrifuged  for  5 
min.  and  the  supernatant  extracted  with  phenol/chloroform.  The 
supernatant  was  then  precipitated  with  2  volumes  of  95%  ethanol  at  room 
temperature.  DNA  was  then  suitable  for  restriction  enzyme  digestion  and 
agarose  gel  analysis. 

Growth and preparation of cell lines. C127 cells were utilized in
all  transfections  and  were  grown  in  10  cm  tissue  culture  dishes  as 
monolayer  cultures.  The  medium  used  in  all  experiments  was  Dulbecco's 
modified  essential  medium  (Gibco)  supplemented  with  5%  calf  serum 
(Gibco), 5% horse serum (Gibco), 2 mM L-glutamine, 100 U/ml
penicillin, and 100 µg/ml streptomycin. To initiate a cell line (histone
plasmid and pSV2neo) or transient (histone plasmid only) transfection
the cells were refed with 10 ml of medium 2-4 hours before application
of the DNA precipitate. Stable cell lines were initiated by the
cotransfection of the histone plasmid and pSV2neo in a 10:1 ratio. This
was done essentially as described by Graham and van der Eb (1973) and
Gorman et al. (1982). Plasmid DNA, usually 10 µg/construct, was
diluted to 450 µl with 1 mM Tris-HCl pH 7.9, 0.1 mM EDTA. This was then
mixed with 50 µl of 2.5 M CaCl2. The DNA solution was then added
dropwise to 500 µl of 2X Hepes Buffered Saline (280 mM NaCl, 50 mM
HEPES, 1.5 mM Na2HPO4, pH 7.12 ± 0.05) in a sterile 15 ml conical tube
while  the  tube  was  vortexed.  The  precipitates  were  allowed  to  stand  for 
20  min.  and  were  grey  and  cloudy  in  appearance.  A  poor  precipitate  was 
obvious  as  settling  out  occurred  during  the  20  min.  incubation.  The  DNA 
precipitates  were  added  to  the  plates  dropwise  under  sterile  conditions 
with  gentle  swirling.  After  4  hours  the  medium  was  removed  and  the 
cells  were  shocked  for  1-2  min  with  15%  glycerol  in  medium.  This  was 
removed,  the  cells  washed  with  10  ml  of  incomplete  medium  and  refed 
with  20  ml  of  complete  medium.  For  transient  transfections  the  cells 
were  incubated  for  24-48  hours  and  then  harvested  (80-90%  confluency) 
as  described  below. 
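
The final composition of the calcium phosphate precipitate follows from the volumes and stock concentrations given above. The short calculation below is a sketch using the dilution relationship C1 x V1 = C2 x V2, and the variable and function names are illustrative only. It shows that mixing the 500 µl DNA/CaCl2 solution with 500 µl of 2X Hepes Buffered Saline yields 1 ml containing 125 mM CaCl2 in 1X buffer.

```python
def final_concentration(stock_conc, stock_volume, total_volume):
    """Concentration after dilution, from C1 * V1 = C2 * V2."""
    if total_volume <= 0 or stock_volume > total_volume:
        raise ValueError("total volume must be positive and >= stock volume")
    return stock_conc * stock_volume / total_volume


# Volumes in microliters and stock concentrations in mM, as given in the text
dna_volume = 450.0
cacl2_volume = 50.0
hbs_volume = 500.0
total_volume = dna_volume + cacl2_volume + hbs_volume  # 1000 ul

precipitate_mM = {
    "CaCl2": final_concentration(2500.0, cacl2_volume, total_volume),
    "NaCl": final_concentration(280.0, hbs_volume, total_volume),
    "HEPES": final_concentration(50.0, hbs_volume, total_volume),
    "phosphate": final_concentration(1.5, hbs_volume, total_volume),
}
print(precipitate_mM)
# {'CaCl2': 125.0, 'NaCl': 140.0, 'HEPES': 25.0, 'phosphate': 0.75}
```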

Cell lines were initiated by growing the cells to confluency,
approximately 2-3 days. At this point the cells were split 1:5 into
five plates and the medium was supplemented with 500 µg/ml of Geneticin
(G418,  Gibco).  The  aminoglycoside  phosphotransferase  3' (II)  gene 
carried  on  the  pSV2neo  plasmid  confers  resistance  to  this  antibiotic 
and  therefore  permits  cell  growth  if  present.  Cells  were  refed  with 


medium  +  G418  every  3-4  days  until  resistant  colonies  were  apparent 
and  most  of  the  other  cells  had  died.  This  usually  took  approximately 
2-3  weeks.  All  the  colonies  on  an  individual  plate  were  pooled  and 
subsequently  passaged  in  drug- free  medium- -these  were  referred  to  as 
polyclonal  cell  lines.  The  clone  name  for  a  cell  line  contains  several 
designations.  For  example:  pFO003pl,  the  pFO  designates  this  construct 
as  originally  derived  from  the  AHHG  41  clone  isolated  by  Sierra  et  al . 
(1982),  003  describes  the  deletion  construct,  and  pi  refers  to 
polyclone  number  1.  When  an  "m"  is  used  instead  of  a  "p"  this  indicates 
a  monoclonal  cell  line.  To  produce  monoclonal  cell  lines,  12 
individual  colonies,  2-3  from  each  plate,  were  picked  with  a  cotton 
plugged  sterile  pasteur  pipette  and  grown  in  24  well  cell  plates 
(Corning) .   After  these  cells  had  expanded  they  were  grown  in  6  and  10 
cm  dishes  as  described  above. 
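
The naming convention can be made explicit with a short parser. This is only a sketch: the function and field names are hypothetical, and the assumption that every construct number has three digits is taken from the examples in the text rather than from a stated rule.

```python
import re
from typing import NamedTuple

class CellLineName(NamedTuple):
    source: str      # "pFO": derived from the lambda HHG 41 clone
    construct: str   # deletion construct number, e.g. "003"
    clonality: str   # "polyclonal" or "monoclonal"
    number: int      # polyclone or monoclone number

# Assumed layout: "pFO" + three-digit construct + "p" or "m" + isolate number.
_NAME_PATTERN = re.compile(r"^(pFO)(\d{3})([pm])(\d+)$")

def parse_cell_line_name(name: str) -> CellLineName:
    """Split a cell line name such as 'pFO003p1' into its designations."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognized cell line name: {name!r}")
    source, construct, kind, number = match.groups()
    clonality = "polyclonal" if kind == "p" else "monoclonal"
    return CellLineName(source, construct, clonality, int(number))

print(parse_cell_line_name("pFO003p1"))
print(parse_cell_line_name("pFO003m5"))
```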

Cell  lines  and  C127  cells  were  frozen  down  periodically  in  medium 
supplemented  with  20%  foetal  calf  serum  (Gibco)  and  10%  glycerol.  Cells 
were  washed  off  the  plate  in  Puck's  Saline  +  0.02%  EDTA,  centrifuged  at 
1500  rpm  for  2  min,  resuspended  in  freezing  medium  in  Nunc  Cryotubes, 
and  placed  at  -70°C. 

Southern  blot  analysis.   This  method  has  been  used  to  determine  the 
copy  number  of  the  individual  monoclonal  cell  lines  and  the  status  of 
the  integrated  constructs  with  respect  to  flanking  sequences  and  mode 
of  integration.  In  general,  DNAs  from  individual  monoclonal  cell  lines 
were  digested  to  completion  with  restriction  enzymes  in  the  buffer 
recommended  by  the  supplier.  The  restriction  enzyme  reactions  were 
stopped by the addition of 1/10 volume of running dye (1X TBE, 50%
glycerol, 0.2% sodium dodecyl sulfate, 0.01% bromophenol blue, and
0.01% xylene cyanol) and heated to 65°C for 15 min. The DNA was then
loaded onto 1% agarose gels and run 16-18 hours at 70 V. Gels were
stained in ddH2O with 5 µg/ml ethidium bromide. Next, the gels were
soaked in 25 mM HCl for 10 min. to cause strand breaks that permit
better transfer and then transferred to Zetabind nylon membranes
(AMF-Cuno) as described by Southern (1975) except that the transfer buffer
was 0.4 M NaOH (methodology kindly provided by Dr. Harry Ostrer,
University of Florida, Department of Pediatric Genetics). Transfer was
complete in 20-24 hrs. The filters were gently washed in 2X SSC (20X
SSC = 3 M NaCl, 0.3 M sodium citrate, pH 7.0) 3 times for 15 min. each.
The filters were briefly air dried and then washed in 0.1X SSC, 0.5%
SDS for 1 hr at 65°C. At this point filters were stored at 4°C in
plastic Seal-a-meal bags. Blots were prehybridized in 5X SSPE (15X SSPE
= 2.69 M NaCl, 150 mM NaH2PO4, 15 mM EDTA, pH 7.7), 0.1% SDS, and 1.0%
non-fat dry milk (Carnation) at 67-68°C for 4-6 hrs. Hybridizations
were performed in the above solution with the addition of either
denatured nick-translated or oligolabelled probe. For blots probed with
histone  H4  sequences  1-2  x  10°  cpm/ml  of  probe  were  used  in  the 
hybridization.  For  mouse  18S  ribosomal  RNA  hybridizations,  1-2  x  lO-" 
cpm/ml  of  the  pUC974  insert  probe  were  utilized.  The  specific  activity 
of  all  probes  was  at  least  1  x  10°  cpm/ug.  The  length  of  hybridization 
was from 18-20 hrs at 67-68°C. Filters were washed 3 times at room
temperature with agitation in 5 mM NaPO4 pH 7.0, 2 mM EDTA, and 0.2%
SDS. Each wash was 30 min in length. After a brief drying period the
filters were sealed in plastic bags (to prevent dehydration and
facilitate the subsequent removal of probe fragments) and exposed to
preflashed XAR-5 film (Kodak) at -70°C.

Preparation  of  DNA  from  monoclonal  and  polyclonal  cell  lines.  The 
medium  from  each  plate  was  removed  and  2  ml  of  Puck's  saline  (Gibco) 
with  0.02%  EDTA  were  added.  The  cells  were  physically  removed  from  the 
plate  by  scraping  with  a  rubber  spatula  and  placed  in  a  sterile  15  ml 
Corex tube. The cells were pelleted by centrifugation at 1500 rpm for
2 min. at 4°C in an IEC-International centrifuge. At this point the
supernatant was removed and the cells were snap frozen on dry ice.
Frozen pellets were quickly resuspended in 1 ml of 0.1X SSC, 1.0% SDS,
and 200 µg/ml proteinase K (Sigma Chemical Company) and incubated for 4
hrs to overnight at 37°C. This mixture was then extracted 2 times and
precipitated with 2 volumes of 95% ethanol at -20°C overnight. The
precipitated nucleic acids were recovered by centrifugation at 10K rpm
for 10 min. at 4°C. The pellet was dried briefly and resuspended in 1
ml of TE and RNaseA (Sigma) was added to a final concentration of 50
µg/ml. Digestion proceeded for 1 hr at 37°C and was stopped by the
addition of SDS to 0.5% and phenol/chloroform extraction. DNA was then
precipitated with 2 volumes of 95% ethanol, centrifuged at 10K rpm for 10
min, and the pellet resuspended in 500 µl of TE and stored at 4°C.

Copy number analysis. Approximately 30 µg of genomic DNA from an
individual cell line were diluted to 50 µl with TE. Digestions were
carried out in EcoRI buffer (Boehringer-Mannheim) with the following
regime: 1 unit/µg of EcoRI and XbaI were added and incubated at 37°C
for 4-8 hrs, at which point an additional 1 unit/µg was added and the
digestion proceeded overnight (16-18 hrs). The DNA was quantitated by
diluting 5 µl of the digestion into 1 ml of TE and determining the
OD260. The completion of digestion was determined by gel
electrophoresis of a small aliquot of the digestion on a 1% agarose
minigel (Bio-Rad). Ten micrograms of digested DNA were electrophoresed
and blotted as above (Southern Blotting). The probes used for the copy
number determination were either the EcoRI/XbaI fragment from pFO002
(for the human H4 histone genes) or the BamHI/SalI fragment from p974
(mouse 18S ribosomal gene for quantitation). The probes were labelled
by either nick-translation or oligolabelling (see below). The copy
number quantitation of the human H4 histone gene was done by
densitometric scanning of multiple autoradiograms. The exact amount of
DNA  in  each  lane  was  determined  by  reprobing  the  Southern  blots  with 
the  mouse  18S  ribosomal  gene.  This  gene  served  as  an  internal  control 
for  variations  in  the  actual  amount  of  DNA  loaded  and  any  loss  during 
the  process.  The  copy  number  of  the  mouse  18S  ribosomal  gene  should  be 
invariant  and  all  densitometric  values  for  the  human  H4  histone  genes 
were  corrected  to  account  for  the  actual  amount  of  DNA  in  the  lane 
based  on  the  internal  control. 
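
The loading correction described above amounts to scaling each H4 signal by the relative 18S signal in the same lane. The following is a minimal sketch of that correction only; the function name, the choice of a reference lane, and the numbers are hypothetical, and the conversion to an absolute copy number is not shown here.

```python
def correct_for_loading(h4_signal, lane_18s_signal, reference_18s_signal):
    """Scale an H4 densitometric value to the DNA actually present in a lane.

    A lane that holds half as much DNA as the reference lane gives half
    the 18S signal, so its H4 value is multiplied by two.
    """
    if lane_18s_signal <= 0:
        raise ValueError("18S signal must be positive")
    return h4_signal * reference_18s_signal / lane_18s_signal

# Hypothetical absorbance values, for illustration only.
reference_18s = 1.0
lanes = {"line 1": (2.0, 1.0), "line 2": (2.0, 0.5)}
corrected = {
    name: correct_for_loading(h4, rrna, reference_18s)
    for name, (h4, rrna) in lanes.items()
}
print(corrected)
```

In this made-up case the two lanes show the same raw H4 signal, but the second lane received half the DNA, so its corrected value is twice as large.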

Labelling  of  DNA  fragments  using  Klenow  fragment.   This  was  done  as 
described by Maniatis et al. (1982). Two hundred nanograms of plasmid
or λ phage DNA were digested to completion with the restriction enzymes
of choice. One to two microcuries of [α-32P]dCTP were added with ~0.5
units of the large fragment of E. coli DNA polymerase I (Klenow
fragment, BRL). The reaction was incubated for 10 min. at room
temperature. Then 2 µl of 0.2 M EDTA, 100 µl of 0.3 M sodium acetate,
and 20 µg of yeast tRNA were added to stop the reaction. The labelled
DNA fragments were recovered by precipitation with 95% ethanol at
-70°C. The DNA was recovered by centrifugation and resuspended in 100 µl
of TE.

Nick translation and oligolabelling. Both of these methods were
utilized for the production of DNA hybridization probes. Nick
translation was done as described by Rigby et al. (1977). For the copy
number experiments the EcoRI/XbaI fragment of pFO002 was isolated with
the IBI fragment eluter and 250 ng were used in the reaction. A 25 µl
reaction was composed of 2.5 µl of 10X buffer (500 mM Tris-HCl pH 7.5,
50 mM MgCl2, 1 mg/ml bovine serum albumin (BSA, Sigma Fraction V)),
2.5 µl of 10X nucleotides (330 µM each of dATP, dGTP, dTTP), 40-80 µCi
of α-32P-dCTP, 2.5 units of E. coli DNA polymerase I (BRL), 1 µl of a 1
X  lO""^  dilution  of  DNasel  (stored  in  10  mM  HCl  at  1  mg/ml)  activated  at 
1:100 for 1-2 hours on ice in 10 mM Tris-HCl pH 7.5, 5 mM MgCl2, 1
mg/ml BSA. The reaction was begun with the final addition of the DNaseI
and incubated at 14°C for 45 min. The reaction was stopped by dilution
with TE and the probe purified over a pipette (10 mm x 100 mm, Fisher)
column of Biogel A1.5m in TE. The sample was applied to the column in a
200 µl aliquot and 200 µl fractions were collected. The labelled DNA
usually came off in fractions 6-10. These were pooled and quantitated
in the scintillation counter. The specific activity of these probes was
always  greater  than  1  x  10^  cpm/ug.  Oligo-labelling  was  done  as 
described by Feinberg and Vogelstein (1983). The DNA fragment (100 to
200 ng) was added to a 1.5 ml Eppendorf tube and ddH2O added to make
the final volume after addition of the other components either 12.5 µl
or 25 µl. This tube was then heated to 95-100°C for two minutes and
placed on ice. To this denatured DNA fragment was added 10 µl of 2X
oligolabelling buffer (2X = 500 mM Hepes pH 6.6, 50 µM each of dATP,
dGTP, dTTP; 125 mM Tris-HCl pH 8.0, 25 mM 2-mercaptoethanol, 0.55 mg/ml
mixed hexanucleotides (Pharmacia)). We added 25-50 µCi of [α-32P]dCTP
and 2.5 units of Klenow fragment (BRL). The reaction was allowed to
proceed for 2 hours to overnight and purified as described above for
the  nick  translation  reaction.  Specific  activity  of  these  probes 

Q 

usually  exceeded  2-4  x  10°  cpm//ig. 

Preparation  of  total  cellular  RNA.   Because  of  the  sensitivity  of 
histone  mRNA  to  degradation  following  the  cessation  of  DNA  synthesis, 
it  was  important  that  the  initial  steps  of  this  protocol  be  carried  out 
as  quickly  as  possible. 

The  medium  from  2-4  plates  was  removed  and  1  ml  of  cold  Puck's 
saline  (Gibco)  +  0.02%  EDTA  was  added  and  the  cells  were  immediately 
scraped from the dish and transferred to a sterile, DEPC-treated Corex
tube. The cells were pelleted in the clinical centrifuge at a setting
of five for 2 min., the supernatant was removed, and the cells were
frozen on dry ice and subsequently stored at -20°C for no more than a
few days. Degradation can occur quickly and therefore it was necessary
to prepare the RNA as soon after harvesting as possible. The cell
pellet was resuspended in 1 ml of 2 mM Tris-HCl pH 7.4, 1 mM EDTA, and
10 µg/ml polyvinylsulfate (PVS, Eastman Kodak). SDS (10%) was added to
a final concentration of 1% and proteinase K added to 200 µg/ml.
Incubation was at 37°C for 30 min., at which point 5 M NaCl was added to
a final concentration of 500 mM and the incubation continued for an
additional 15 min. The total nucleic acids were extracted with 2
volumes of phenol/chloroform, 2 times, and with 3 volumes of
chloroform 1 time. The total nucleic acid was then precipitated by the
addition of 60 µl of 3 M NaAc and 2.5 vols of 95% ethanol (-20°C
overnight). The nucleic acids were recovered by centrifugation at 10K
rpm for 15 min. at 4°C. The pellet was resuspended in 500 µl of 10 mM
Tris-HCl (pH 7.4), 2 mM CaCl2, and 10 mM MgCl2 with the addition of 25
µl of proteinase K treated DNase I (see below for preparation) and
digested at 37°C until it was completely suspended (this usually
required from 30 min. to 1 hr.; intermittent vortexing helped to
disrupt the pellet). When the pellet was no longer visible, SDS and
NaCl were added to a final concentration of 0.5% and 250 mM,
respectively. The solution was extracted 2 times with phenol/chloroform
and 1 time with chloroform, and precipitated with 3 vols of 95% ethanol
overnight. RNA was either stored in water at -70°C or in ethanol at
-20°C. Ethanol suspensions needed to be vigorously mixed to avoid
quantitation  problems  with  the  RNA  aliquots.  RNA  stored  in  water  was 
also  mixed  before  removal. 

Preparation of RNase free DNase I. Deoxyribonuclease I (Sigma) (1
mg/ml in 20 mM Tris-HCl pH 7.4, 10 mM CaCl2) was preincubated at 37°C
for 20 min. and then further incubated for 2 hrs. at 37°C in the
presence  of  0.1  volumes  of  proteinase  K  (1  mg/ml  in  20  mM  Tris-HCl  pH 
7.4,  10  mM  CaCl2)  to  digest  any  contaminating  ribonuclease  activity  as 
described  by  Tullis  and  Rubin  (1980).  This  preparation  was  stable  on 
ice  for  several  hours  to  overnight. 

S1 nuclease protection assay. This method is essentially as
described by Berk and Sharp (1977) with modifications. In order to
detect the human histone H4 mRNAs, 25 µg of total cellular RNA from a
C127 cell line containing an integrated human H4 histone gene construct
were added to a DEPC-treated 1.5 ml Eppendorf tube. Sufficient human
and mouse probe, labelled with [γ-32P]ATP, was added to provide an
excess (5 to 10 ng) of protected fragment in the reaction. Probe excess
was either determined by titration of the probes with a stock C127 or
HeLa RNA sample or by addition of twice the amount of probe to some
reactions. One twentieth volume of 5 M NaCl and 3 volumes of 95% ethanol
were added and the solution was placed on dry ice for 15-30 min. The
precipitated RNA and probes were recovered by centrifugation at 10K
rpm for 15 min. at 4°C. The pellet was briefly dried in a Savant Speed
Vac (1-2 min.). Four microliters of 5X hybridization buffer (2 M NaCl,
0.2 M Pipes pH 6.4, and 5 mM EDTA) were added followed by 16 µl of
recrystallized formamide (Specialty Biochemicals). The buffer was added
first to the pellet to facilitate rehydration. The final volume, 20 µl,
was vortexed vigorously to resuspend the precipitated RNA and probe.
The tubes were placed at 90°C for 10 min. and then transferred
immediately to a 55°C water bath and incubated for 12-18 hrs
(overnight). Each tube was removed individually from the water bath and
the reaction diluted immediately with 8 volumes of ice-cold S1
digestion buffer (280 mM NaCl, 50 mM NaOAc, pH 4.5, and 5 mM ZnSO4) and
placed briefly on ice. S1 nuclease (Boehringer-Mannheim) was added to a
final concentration of 3 units/µl and digestion was then done at
24-26°C for one hour and at 4°C for 15 min. (the tubes were placed on
ice). Ten microliters each of 10% SDS and 5 M NH4OH were added and the
reaction was extracted and precipitated with 3 volumes of 95% ethanol.
The length of precipitation was from 3-12 hours at -20°C. (The
precipitations should not be done at -70°C as this will cause the
formation of formamide crystals.) The precipitated probe fragment was
recovered by centrifugation at 10K rpm for 30 min. The pellet was
briefly dried and resuspended in 2-4 µl of loading buffer (80%
formamide, 1X TBE, 0.01% Bromophenol Blue, and 0.01% Xylene Cyanol).
Samples were denatured at 100°C for 3 min. and placed immediately on
dry ice until loaded. Samples were electrophoresed on a 6%
polyacrylamide, 8.3 M urea gel at 50 W constant power for 3-4 hours
(the acrylamide to bisacrylamide ratio was 20:1). Gels were dried and
exposed to preflashed XAR-5 film (Kodak) at -70°C with Dupont Cronex
Lightning  Plus  Screens. 
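
The hybridization and digestion volumes given above determine the working concentrations in each step. The sketch below works through that arithmetic. The names are illustrative, the nuclease concentration is read as 3 units per microliter, and applying it to the full diluted volume (hybridization mix plus digestion buffer) is an assumption rather than something the protocol states.

```python
def final_concentration(stock, stock_volume_ul, total_volume_ul):
    """Concentration after diluting stock_volume_ul of stock into total_volume_ul."""
    return stock * stock_volume_ul / total_volume_ul

# Hybridization: 4 ul of 5X buffer plus 16 ul of formamide
buffer_ul, formamide_ul = 4.0, 16.0
hybridization_ul = buffer_ul + formamide_ul

hybridization_mix = {
    "NaCl_mM": final_concentration(2000.0, buffer_ul, hybridization_ul),
    "PIPES_mM": final_concentration(200.0, buffer_ul, hybridization_ul),
    "EDTA_mM": final_concentration(5.0, buffer_ul, hybridization_ul),
    "formamide_percent": final_concentration(100.0, formamide_ul, hybridization_ul),
}

# Digestion: dilution with 8 volumes of S1 digestion buffer
digestion_buffer_ul = 8 * hybridization_ul
digestion_total_ul = hybridization_ul + digestion_buffer_ul

# Assumption: 3 units/ul refers to the whole diluted reaction
s1_units_per_reaction = 3.0 * digestion_total_ul

print(hybridization_mix)
print(digestion_total_ul, s1_units_per_reaction)
```

The 4:16 ratio brings the 5X buffer to 1X in 80% formamide, and the eight-fold dilution lowers the formamide concentration before the nuclease is added.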

DNA  sequencing.   All  sequencing  reactions  were  carried  out  exactly 
as  described  by  Maxam  and  Gilbert  (1980)  and  so  will  not  be  detailed 
here. For each fragment that was sequenced the G (dimethyl sulfate,
DMS); G+A (formic acid); C+T (hydrazine); C only (hydrazine in high
salt); and A>C (1.2 N NaOH) reactions were done. Single end labelled
fragments were prepared as follows: plasmid DNAs were digested with an
appropriate restriction endonuclease, treated with phosphatase, and
labelled  as  described  below.  After  the  DNA  was  labelled  it  was  digested 
with  a  second  restriction  enzyme  to  produce  two  single  end  labelled 
fragments.  To  purify  the  fragment  of  interest  for  analysis  we 
electrophoresed  the  DNA  on  a  native  4%  acrylamide  gel.  The  location  of 
each  labelled  DNA  band  on  the  gel  was  determined  by  exposure  to  Cronex 
(Dupont)  X-ray  film.  After  alignment  of  the  film  and  the  gel  we  excised 
the bands of interest and eluted them in 500 µl of 500 mM ammonium
acetate, 10 mM MgCl2, 0.5% SDS, overnight at 37°C as described by
Maxam and Gilbert (1980). The acrylamide gel slice was ground with a
siliconized glass rod in a 1.5 ml Eppendorf tube prior to addition of
the elution buffer. After the overnight incubation the acrylamide was
centrifuged to the bottom of the tube at 10K rpm for 5 min. The
supernatant was removed and the pellet resuspended in 200-400 µl of
elution  buffer,  centrifuged,  and  the  supernatant  removed.  This 
procedure  routinely  resulted  in  recoveries  of  80-90%  of  the  labelled 
DNA  fragment.  The  pooled  supernatants  were  then  precipitated  twice  in 
succession  with  3M  Sodium  Acetate  and  95%  ethanol.  These  fragments  were 
then  used  in  the  sequencing  reactions  noted  above.  After  the  reactions 
were  carried  out  and  the  DNA  was  cleaved  with  piperidine  and 
lyophilized, it was electrophoresed (50 W constant power) on a 6%
acrylamide, 8.3 M urea gel (45 cm x 30 cm x 0.5 mm). The samples were
resuspended in 6 µl of S1 loading buffer and divided into two 3 µl
aliquots. These were boiled for 3 min. and placed on dry ice. To
maximize the amount of the sequence we could read, two loadings of the
reactions were done. The first 3 µl sample of each reaction was loaded
and  electrophoresed  for  5-6  hours  or  until  the  Bromophenol  Blue  reached 
the  bottom  of  the  gel.  The  second  sample  was  then  loaded  and 
electrophoresed  for  an  additional  5-6  hours.  The  gel  was  then  dried  and 
exposed  to  either  Cronex  or  XAR-5  film  at  room  temperature  overnight. 

S1 nuclease analysis probe preparation. Two probes were routinely
used to quantitate the amount of human and mouse histone H4 mRNA
present in cell line samples. The human probe was prepared by digestion
of 50-100 µg of pFO005 or pFO002 with NcoI. This digestion was then
extracted, precipitated, and the DNA recovered by centrifugation at
10K rpm for 15 min. The pelleted DNA was resuspended in 50 µl of 50 mM
Tris-HCl pH 8.0, 0.1 mM EDTA; 1 unit of calf intestinal phosphatase
(CIP) was added and the mixture was incubated at 37°C for 30 min. An
additional aliquot of enzyme was added and the DNA incubated for 30
min. The reaction was stopped by the addition of EGTA
(ethyleneglycol-bis-(β-aminoethyl ether)-N,N,N',N'-tetraacetic acid) to 10 mM and
heated to 65°C for 20 min. The DNA was then extracted and precipitated.
The DNA was resuspended in 10 µl of γ-32P-ATP (100 µCi) and 1 µl of 10X
kinase buffer (500 mM Tris-HCl pH 7.6, 100 mM MgCl2, 100 mM
2-mercaptoethanol). After resuspension, 15 units of T4 polynucleotide
kinase (United States Biochemical Corporation) were added and the
reaction incubated at 37°C for 45 min. The reaction was stopped by
extraction followed by precipitation. The DNA was recovered,
resuspended and digested with HindIII to produce a probe fragment
labelled at the NcoI site in the human H4 gene. The reaction was
electrophoresed on a 1.0% agarose gel in 1X TBE and the 695 bp
NcoI/HindIII fragment, labelled at the NcoI site, was purified with the IBI
fragment eluter as described by IBI. The mouse H4 probe was produced in
a similar manner from the plasmid pBR-mus-hi-l-H4-HinfI (Seiler-Tuyns
and Birnstiel, 1981) digested with BstNI. The labelled 1000 bp BstNI
fragment was isolated and used as a control in each S1 nuclease
protection assay. Although this probe was not single end labelled, we
had no ambiguities because of this fact. To make the probe shorter and
single end labelled would have possibly obscured the protected fragment
of the human H4 gene (280 nt). Both the human and mouse H4 S1 nuclease
probes were quantitated on agarose gels stained with ethidium bromide
and exposed to Cronex X-ray film to judge the relative strength of
each. Generally a large amount of probe (several micrograms) was
prepared simultaneously and S1 nuclease analysis was done on many
samples to ensure that the expression was measured with the same
strength probe in each case. Variation in the mouse and human probe
specific activity did occur; however, the data presented in this work
were prepared primarily from a large set of S1 nuclease assays in
which  many  cell  lines  were  assayed  side  by  side  with  the  same  mouse  and 
human  probe  preparation.  When  additional  cell  lines  were  subsequently 
measured,  samples  assayed  previously  were  included  to  ensure  that  the 
results  could  be  related  to  results  from  previous  assays. 

Densitometry and data analysis. Densitometry of autoradiograms
was done to quantitate the S1 nuclease analysis experiments of H4 gene
expression and the copy number of the cell lines. Several films of
different length exposure were utilized to determine the intensity of
the S1 protected fragment signal. Two densitometers were used, a Zeineh
laser  densitometer  and  an  LKB-Pharmacia  high  intensity  laser 
densitometer.  Comparison  of  the  capabilities  of  each  densitometer 
demonstrated  that  for  most  films  either  one  was  adequate;  however  for 
particularly  low  intensity  signals  the  LKB  machine  gave  more 
reproducible  results.  The  data  collected  by  both  densitometers  were 
computer  processed  with  either  the  Videophoresis  II  (Zeineh,  Biomed 
Instruments)  or  the  GelScan  XL  programs  (LKB-Pharmacia).  Each  program 
was  successfully  used  to  analyze  the  intensity  of  radioactive  signals 
for expression and copy number. The areas under the curve for the S1
nuclease analysis (mouse and human) and the copy number blots (H4 and
18S ribosomal) were integrated and expressed in absorbance
units. To calculate the expression of a particular construct, the human
expression value was divided by the mouse value and expressed as a
ratio. Sample calculations for copy number are presented in Appendix A
and for S1 nuclease analysis in Appendix B.
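
The expression ratio itself is a single division of integrated peak areas. A minimal sketch follows, with a hypothetical function name and made-up absorbance values; it is not a reproduction of the sample calculations in the appendices.

```python
def expression_ratio(human_h4_area, mouse_h4_area):
    """Human H4 signal relative to the endogenous mouse H4 signal.

    Both arguments are integrated areas under the densitometer curve,
    in absorbance units, taken from the same lane.
    """
    if mouse_h4_area <= 0:
        raise ValueError("mouse H4 area must be positive")
    return human_h4_area / mouse_h4_area

# Hypothetical areas for one lane, for illustration only.
print(expression_ratio(1.5, 3.0))
```

Because both signals come from the same lane, the ratio is insensitive to differences in the amount of RNA recovered or loaded.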

Agarose and acrylamide gel electrophoresis. Agarose (Bio-Rad
molecular biology grade) gels were prepared as described by Maniatis et
al. (1982). The gel buffer was 1X TBE and the buffer in the reservoir was
also 1X TBE. Gels of 20 x 25 cm were used for large scale fragment
purification and Southern blot analysis of cell line DNAs. Minigels
were used for checking the extent of digestion and analysis of rapid
and other plasmid preparations. Acrylamide gels were routinely run for
S1 nuclease analysis and consisted of 6% acrylamide (20:1 acrylamide to
bisacrylamide), 8.3 M urea, and 1X TBE. The gel solution (75 ml) was
polymerized with the addition of 750 µl of 10% ammonium persulfate and
20 µl of N,N,N',N'-tetramethylethylenediamine. It was immediately
poured, the comb was put into place, and the gel was allowed to harden
for 1 hour. Before use the wells were rinsed with buffer and the gel was
preelectrophoresed for 30 min. at 50 W constant power. The samples were
loaded and electrophoresed at 50 W constant power.

Genomic  sequencing.  This  technique  was  done  as  described  by  Church 
and Gilbert (1984). Monoclonal cell lines pFO003m1, 5, and 6 were grown
in 15 cm plates (10 per construct). Seven of the 10 were treated with
0.5% DMS in 2-3 ml of medium for 1-2 minutes. Three were left
untreated, the DNA purified, and treated with DMS in vitro as a
control. The DMS was removed from the plate and the cells washed twice
in phosphate buffered saline (PBS = 150 mM NaPO4, 150 mM NaCl, pH 7.2, 60
mM Tris-HCl, pH 7.4). The DMS treated cells were scraped from the plate
and the DNA purified by incubation with proteinase K as described above
and extraction. To purify high molecular weight DNA only, 95% ethanol
was slowly added to the tube while swirling the solution with a
siliconized glass rod. The DNA was washed off the rod with TE and
quantitated spectrophotometrically. The purified DNA (30 µg) was
restricted with Hinc II, treated with piperidine and lyophilized as
described by the sequencing protocol of Maxam and Gilbert (1980). The
samples were then separated in a 6% acrylamide gel with 8 M urea and
electrotransferred to a nylon membrane (Genescreen). The hybridization
probe was prepared as described by Pauli et al. (1987) with primer
extension of a fragment cloned into M13. In our experiments
hybridization was performed with the Hinc II 5' upper strand probe at
65°C for 16 hrs, followed by eight 5 min. washes at 65°C (1 mM EDTA, 40
mM NaHPO4, pH 7.2, 1% SDS). The membrane was then exposed to preflashed
XAR-5 film at -70°C. In these experiments I was responsible for the
growth  of  the  cells  and  Dr.  Urs  Pauli  performed  the  rest  of  the 
experiment,  with  my  constant  encouragement,  and  occasional 
intervention. 

Statistical analysis. The analysis of the S1 nuclease and copy
number data that we accumulated was suggested by Dr. Mike Conlon of the
University of Florida Biostatistics Unit. After he had examined the
data and gained an understanding of the complexities involved, he
advised that we employ a ranking test, the Wilcoxon Rank Sum Test. This
test makes the null assumption that the two groups of data being
compared came from the same random distribution. The data from both
groups are pooled and each member is assigned a rank (i.e. 1, 2, 3, ...)
from lowest to highest. For example, if we had two sets of data, A = 1, 2, 4,
6, and 12 and B = 10, 14, 16, 19, and 25, the members of groups A and B
would be ranked together in order of increasing value. The absolute values of
the data are ignored and only the ranks are examined.

Group A: (1, 2, 4, 6, 12) is converted to Ranks = 1, 2, 3, 4, 6.
Group B: (10, 14, 16, 19, 25) is converted to Ranks = 5, 7, 8, 9, 10.
We have 5 members in each group with only one point of overlap
between the two groups at ranks 5 and 6. The Rank Sum for group A = 16
and for group B = 39. To determine if the difference of the Rank Sums
is  significant,  statistical  tables  of  probability  for  this  test  were 
employed.  These  two  groups  of  data  are  not  significantly  different  at 
p  <  0.05.  The  reason  is  the  small  sample  size.  With  only  five  members 
in  each  group  the  fact  that  one  of  the  members  of  each  group  falls  into 
the  range  of  the  other  group  precludes  any  significance.  As  the  groups 
become  larger  the  overlap  allowed  for  significance  becomes  greater.  I 
have  found  with  some  of  my  data  that  larger  sample  sizes  would  have 
been  necessary  to  employ  this  test  in  all  cases. 
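
The ranking step of the test can be sketched in a few lines. The function name is illustrative, the use of average ranks for tied values is a common convention assumed here rather than one stated in the text, and the sketch stops at the rank sums, leaving the significance lookup to the probability tables.

```python
def pooled_ranks(group_a, group_b):
    """Rank the pooled data from lowest to highest, then split the ranks by group.

    Tied values share the average of the ranks they occupy
    (an assumed convention, not taken from the text).
    """
    labelled = sorted([(v, "A") for v in group_a] + [(v, "B") for v in group_b])
    ranks = {"A": [], "B": []}
    start = 0
    while start < len(labelled):
        end = start
        while end < len(labelled) and labelled[end][0] == labelled[start][0]:
            end += 1
        # positions start..end-1 occupy ranks start+1..end
        average_rank = (start + 1 + end) / 2
        for _, label in labelled[start:end]:
            ranks[label].append(average_rank)
        start = end
    return ranks

group_a = [1, 2, 4, 6, 12]
group_b = [10, 14, 16, 19, 25]
ranks = pooled_ranks(group_a, group_b)
rank_sum_a = sum(ranks["A"])
rank_sum_b = sum(ranks["B"])
print(ranks["A"], rank_sum_a)  # [1.0, 2.0, 3.0, 4.0, 6.0] 16.0
print(ranks["B"], rank_sum_b)  # [5.0, 7.0, 8.0, 9.0, 10.0] 39.0
```

As a check, the two rank sums must add up to n(n + 1)/2 for the n pooled values, here 10 x 11 / 2 = 55.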


CHAPTER  3 
HISTONE  H4  5'  REGULATORY  SEQUENCES 
It  has  been  established  that  the  steady  state  level  of  histone  mRNA 
during  the  cell  cycle  is  a  function  of  both  transcription  and  message 
stability.   These  two  components  of  histone  mRNA  metabolism  have  been 
studied  in  a  number  of  different  ways.   Earlier  studies  by  Plumb  et  al. 
(1983a,  b)  utilized  pulsed  incorporation  of  ^H-uridine  to  determine  the 
contribution  of  transcription  to  the  increase  in  histone  mRNA  levels 
during  the  S-phase  of  the  cell  cycle.   Later,  Baumbach  et  al.  (1987) 
used  nuclear  run-on  transcription  to  measure  transcription  of  the 
histone  genes  directly  during  the  cell  cycle.   The  increase  in 
transcription  during  early  S-phase  was  determined  to  be  3-5  fold  by 
both Baumbach et al. (1987) and Plumb et al. (1983b). In the studies
of Baumbach et al. (1987), message stability was eliminated as a
variable  in  the  experiments,  and  therefore  they  were  able  to  determine 
that  histone  gene  transcription  occurred  throughout  the  cell  cycle  at  a 
basal  level.   Instead  of  an  "on/off"  mechanism  for  transcriptional 
control  an  "enhancement"  was  apparent  during  the  first  4  hours  of  S- 
phase.   The  3-5  fold  enhancement  in  the  histone  gene  transcription 
level  has  been  duplicated  in  various  systems  and  by  different  methods 
during the last 5 years (Sittman et al., 1983; Heintz et al., 1983;
Artishevsky et al., 1987).
The implications are that protein/DNA or protein/protein
interactions  occur  that  stimulate  the  increased  level  of 
transcription.   Evidence  for  specific  protein/DNA  interactions  has  been 
gathered  by  Artishevsky  et  al.  (1987).   They  demonstrated,  at  the  end 
of G1 and the beginning of S phase, the presence of a factor that
interacted  with  the  proximal  promoter  region  of  the  hamster  H3 
promoter.   The  F0108  H4  gene,  with  which  my  work  has  been  done,  also 
demonstrates  protein/DNA  interactions  in  the  proximal  promoter  region 
(Pauli et al., 1987; van Wijnen et al., 1987); however, there are no
detectable  changes  in  these  interactions  during  the  cell  cycle.   Since 
it  has  been  demonstrated  that  transcription  of  the  F0108  H4  histone 
gene  proceeds  throughout  the  cell  cycle  at  a  basal  level,  it  was  of 
interest  to  discover  what  sequences  are  necessary  for  basal  and 
enhanced  expression.   The  promoter  of  the  F0108  H4  histone  gene  is 
potentially extensive, and so deletions that encompass the entire 6.5 kb
of possible promoter sequence were prepared and analyzed. In the
proximal region of the promoter we were interested in understanding the
functionality of elements such as the TATAA box, GGTCC element, Sp1
binding site, and putative CAAT boxes. More distal elements have also
been  examined  and  these  included  a  possible  enhancer  and  negative 
regulatory  element  located  thousands  of  base  pairs  upstream. 

As mentioned in the introduction, the differences encountered between in
vivo and in vitro transcription systems have sometimes been
considerable.   In  order  to  ascertain  the  functional  in  vivo  promoter 
sequences  of  the  F0108  human  H4  histone  gene,  we  constructed  a  series 
of mouse C127 cell lines, each containing a different H4 promoter
deletion construct. As described in the prologue to the Materials and
Methods  section,  we  decided  that  this  was  the  best  way  to  proceed.   We 
hoped  that  stable  integration  into  the  chromosome  would  give  the  most 
accurate  information  about  the  function  of  H4  promoter  sequences. 

Cell  line  construction 
The  first  step  in  these  experiments  was  to  construct  the  cell 
lines.   The  mouse  C127  cell  line  was  chosen  because  it  was  a 
heterologous  host  and  had  been  previously  used  to  support  the  stable 
expression  of  the  F0108  human  H4  gene  in  an  episomal  form  (Green  et 
al.,  1986).   Many  of  the  histone  H4  plasmid  DNA  constructs  were 
available  already  (Figure  3-1),  although  as  the  work  progressed  several 
more  were  prepared  to  answer  various  questions  that  arose.  The 
constructs are all products of subclones of the original λ human
histone gene clone 41 (λHHG41) isolated by Sierra et al. (1982), which
is diagrammed at the top of Figure 3-1. The proximal deletion
constructs J67, J56, J50, K8, and L14 (Figure 3-1) were all available
and had been made by Bal31 deletion of pF0108A (Sierra et al., 1983).
The precise determination of each deletion point will be outlined later
in the chapter. Plasmid pF0108A, a subclone of pF0108 prepared by Sierra et
al. (1983), lacks some 3' sequences including an Alu repeat. Plasmid
pF0005 was made by A. van Wijnen from a HindIII digestion of pF0002.
Plasmid pF0002 was prepared from a BamHI/PstI digest of λHHG41 to
obtain a fragment with 1065 bp of 5' flanking sequence. Plasmid pF0003
was prepared from an XbaI digest of λHHG41 and has 6.5 kb of 5'
flanking sequence. Additional clones will be described as they pertain
to subjects under discussion later in the chapter (positive and
negative regulatory elements).

Initiation  of  Transcription  and  Basal  Regulation 
The  initiation  of  transcription  by  RNA  polymerase  II  and  the 
sequences  required  for  it  have  been  studied  in  considerable  detail  in  a 
number of genes, as outlined in the introduction (reviewed in Shenk,
1981). The importance of the TATA box has been established in vitro
and  in  vivo,  and  it  is  thought  to  be  primarily  responsible  for  the 
specification  of  the  transcription  initiation  site.   We  constructed 
cell  lines  with  several  of  the  short  proximal  deletion  constructs  in 
order  to  ascertain  what  sequences  in  the  F0108  H4  histone  gene  were 
necessary  for  the  initiation  of  transcription.   The  general  protocol 
for  DNA  transfection  and  the  subsequent  selection  and  expansion  process 
is  outlined  in  Figure  3-2.   The  constructs  were  cotransfected  into  C127 
cells  with  the  plasmid  pSV2neo.   The  inclusion  of  the  pSV2neo  plasmid 
permitted  selection  for  expression  with  the  antibiotic  Geneticin 
(G418).   Once  resistant  cells  were  present  as  distinct  colonies  the 
plates  were  either  pooled  and  passaged  (polyclones)  or  picked  and 
expanded  as  monoclonal  cell  lines.   The  specific  method  is  described  in 
the Materials and Methods section.

To  determine  the  level  of  transcription  from  each  of  the  proximal 
deletion  constructs,  we  analyzed  cell  lines  early  in  passage.   The 
results from S1 nuclease analysis of total cellular RNA from polyclonal
cell lines 108A, L14, K8, J50, J56, and J67 are presented in Figure 3-3.
RNA was prepared from each cell line as described and hybridized to two
probes, human and mouse, at 55°C for 8-16 hours as described in


Figure  3-2     Flow  diagram  for  the  production  of  both  polyclonal  and 
monoclonal  mouse  cell  lines  that  contain  stable 
integrated  human  histone  H4  genes. 

The  method  relies  on  the  cotransfection  of  the  histone  plasmid  with  a 
selectable marker, pSV2neo. This plasmid carries the gene that confers
resistance  to  a  derivative  of  neomycin.  The  cotransfection  procedure 
permitted  the  pSV2neo  plasmid  to  be  taken  up  with  the  histone  plasmid 
into  the  mouse  C127  cells.   These  stable  cell  lines  were  utilized  to 
study human H4 gene expression. The specific protocol is outlined in
Materials and Methods.

Figure 3-3     S1 nuclease analysis of proximal deletion polyclonal
cell  lines. 

S1 nuclease analysis was done as described in Materials and Methods and
quantitated by densitometry. Lanes: the cell line name is denoted above
the lane. For example, polyclonal cell line pF0108A number 1 is denoted
as 108Ap1; C, C127 total cellular RNA and H, HeLa total cellular RNA
incubated with both human and mouse S1 probes as a positive control for
the size of the mouse and human S1 protected fragments, respectively;
M, pBR322 HpaII marker labelled with α-32P-dCTP and Klenow fragment.
Both  human  (280  nt)  and  mouse  (110  nt)  protected  fragments  are  noted  at 
the right.

Materials and Methods. The mouse H4 histone probe was included as an
internal control in each S1 nuclease assay not only for the intactness
of the RNA preparation, but also as an indicator of the amount of
histone mRNA present in the sample. The half-life of a histone mRNA
after the cessation of DNA synthesis is very short (Plumb et al.,
1983a; Sittman et al., 1983), and therefore the growth conditions of
the cells and the temperature at the time of harvest are critical for the
adequate recovery of histone mRNA.

We  particularly  wanted  to  determine  if  there  was  a  minimal  amount 
of  promoter  that  could  initiate  transcription  in  vivo  and  if  this  was 
different from that seen in vitro. Previously, the shortest Bal31
deletion, J67, had been shown to initiate mRNA synthesis accurately in
vitro in a whole cell extract (Sierra et al., 1983). As shown in Figure
3-3, the construct J67, which we later learned has only the TATA box
and the GGTCC element, produced no correctly initiated transcripts.
The only transcription products detectable from the J67 construct were
initiated upstream of the normal mRNA start site. These are denoted
with  arrows  in  Figure  3-3,  and  occur  in  the  cell  lines  with  J50,  J56, 
and  J67  integrated.   The  upstream  transcription  start  sites  map 
primarily  to  the  TATA  box  (-30  bp)  and  the  deletion  end  points.   The 
"deletion  end  point  transcripts"  originate  from  outside  of  the  histone 
flanking  sequences  either  in  the  plasmid  or  surrounding  chromosomal  DNA 
and  are  detected  by  virtue  of  the  lack  of  homology  between  the  probe 
and  the  mRNA  past  the  deletion  point. 

The conclusion that J67 was unable to express correctly initiated
H4 mRNA was based on a single polyclonal cell line. To assure
ourselves that this was not a result of a spurious integration event we

Figure 3-4    Southern blot analysis of polyclonal cell lines:

Intactness  of  5'  flanking  regions  and  copy  number  of  the 
constructs  in  each  cell  line. 

Genomic DNA purified from each cell line was digested with EcoRI and
XbaI, electrophoresed, blotted, probed, and quantitated as described in
Materials and Methods. Lanes: 1, pF0108Ap1; 2, pF0108Ap2; 3, L14p2; 4,
L14p3; 5, K8p1; 6, K8p2; 7, J50p1; 8, J56p1 (passage 4); 9, J56p1
(passage 8); 10, J67p1a. Histone plasmid markers (EcoRI/XbaI digested
pF0002) were included on the blot equal to 1.3 (10 pg), 6.5 (50 pg),
and 13 (100 pg) gene equivalents per diploid genome in order to
quantitate the human histone H4 copy number. H, HeLa DNA digested with
EcoRI and XbaI as a positive control for the 1070 bp fragment. M, λ DNA
digested with EcoRI and HindIII and Klenow labelled. Pertinent sizes
are denoted to the right in kilobases. The probe for this experiment
was the EcoRI/XbaI fragment of pF0002 that had been nick-translated as
described in Materials and Methods.

determined the intactness of the flanking and coding sequences for each
of the constructs J67, J56, J50, K8, L14 and 108A (Figure 3-4). This
experiment also permitted us to determine the copy number of each cell
line. Ten micrograms of genomic DNA from each cell line was digested
to completion with EcoRI and XbaI and electrophoresed on a 1% agarose
gel, blotted and probed as described in Materials and Methods. In
order to quantitate the copy number of each cell line the gel also
contained plasmid DNAs of known amounts digested with both EcoRI and
XbaI. Ten, 50 and 100 pg correspond to 1.3, 6.5 and 13 gene
equivalents per diploid genome, respectively, as designated in Figure 3-4.
Several exposures of the autoradiogram were scanned with a Zeineh
laser  densitometer  and  quantitated  in  comparison  to  the  controls. 
Additionally,  the  Southern  blot  in  Figure  3-4  was  quantitated  for  the 
actual  amount  of  DNA  by  densitometrically  scanning  a  photographic 
negative  of  the  gel  prior  to  transfer,  and  differences  in  DNA  amounts 
have  been  taken  into  account  in  the  copy  number  calculation.   Later, 
copy  number  blots  for  other  constructs  were  reprobed  with  a  clone  of 
the mouse 18S ribosomal gene kindly provided by Dr. David
Schlessinger (Washington Univ., St. Louis) to allow exact determination
of  the  amount  of  DNA  loaded  in  each  lane  and  subsequently  transferred. 
A  sample  copy  number  calculation  in  which  the  ribosomal  probe  was 
utilized  is  presented  in  Appendix  A. 
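
The logic of the copy number determination can be sketched as follows. This is a minimal illustration only and is not the calculation presented in Appendix A: the function names, the densitometer readings, and the assumption of a linear signal response passing through the origin are all hypothetical. Only the gene equivalents assigned to the 10, 50 and 100 pg plasmid standards are taken from the text.

```python
def fit_line_through_origin(amounts, signals):
    """Least-squares slope for signal = slope * amount (no intercept)."""
    return sum(a * s for a, s in zip(amounts, signals)) / sum(a * a for a in amounts)


def copy_number(h4_signal, loading_signal, reference_loading, slope):
    """Estimate H4 gene copies per diploid genome for one lane.

    h4_signal         densitometry of the H4 band in the lane
    loading_signal    densitometry of the loading control for that lane
    reference_loading loading-control signal expected for the nominal DNA load
    slope             signal per gene equivalent, from the plasmid standards
    """
    # Correct the H4 signal for the amount of DNA actually loaded.
    corrected = h4_signal * reference_loading / loading_signal
    return corrected / slope


# Gene equivalents per diploid genome for the 10, 50 and 100 pg standards (from the text).
standard_copies = [1.3, 6.5, 13.0]
# Hypothetical densitometer readings for those standards (arbitrary units).
standard_signals = [130.0, 650.0, 1300.0]
slope = fit_line_through_origin(standard_copies, standard_signals)

# Hypothetical lane: the H4 band reads 400 units, but only 80% of the nominal DNA was loaded.
estimate = copy_number(h4_signal=400.0, loading_signal=0.8, reference_loading=1.0, slope=slope)
print(round(estimate, 2))
```

The loading control here stands in for either the scan of the gel negative or the 18S ribosomal band; which one is used changes only the source of loading_signal, not the arithmetic.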

The Southern blot analysis not only established the copy number of
each cell line, but also permitted us to conclude that the flanking region
of  most  constructs  was  intact.   The  mode  of  integration  for  the  histone 
plasmids is described further in chapter 4.

Table 3-1     Quantitation of Polyclonal Cell Line Expression.

Cell Line    Human/Mouse Exp    Copy number    Exp/Copy number

108Ap1       0.016              1              0.016
108Ap2       0.040              13             0.003
L14p2        0.018              1              0.018
L14p3        0.017              4              0.004
K8p1         0.028              3              0.009
K8p2         0.029              2              0.014
J50p1        0.034              1              0.034
J56p1        0.019              50             0.0004

A  quantitative  summary  of  the  expression  data  from  the  polyclonal  cell 
lines  of  the  proximal  deletion  constructs.  The  human/mouse  expression 
ratio was determined by densitometry of the S1 nuclease protected
fragments  in  Figure  3-3.  Copy  number  for  each  cell  line  was  determined 
from  the  Southern  blot  in  Figure  3-4.  Since  these  data  were  derived 
from  polyclonal  cell  lines  it  is  not  possible  to  interpret  the  results 
strictly,  and  we  would  like  to  note  that  copy  number  in  a  polyclonal 
cell  line  is  somewhat  ambiguous.  Expression  is  denoted  as  Exp. 
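
For readers who wish to re-derive the last column of Table 3-1, the arithmetic can be sketched as below. This is an illustrative recomputation in Python, not part of the original analysis; the variable names are hypothetical. The fold values are computed from unrounded ratios, so they differ slightly from figures obtained with the rounded table entries.

```python
# Values transcribed from Table 3-1: (human/mouse expression ratio, copy number).
table_3_1 = {
    "108Ap1": (0.016, 1),
    "108Ap2": (0.040, 13),
    "L14p2": (0.018, 1),
    "L14p3": (0.017, 4),
    "K8p1": (0.028, 3),
    "K8p2": (0.029, 2),
    "J50p1": (0.034, 1),
    "J56p1": (0.019, 50),
}

# Expression per integrated copy for each polyclonal line.
exp_per_copy = {line: exp / copies for line, (exp, copies) in table_3_1.items()}

# Fold difference of each line relative to the shortest expressing construct, J56.
fold_vs_j56 = {line: value / exp_per_copy["J56p1"] for line, value in exp_per_copy.items()}

for line in table_3_1:
    print(f"{line:8s} {exp_per_copy[line]:.4f} {fold_vs_j56[line]:6.1f}")
```

Because the J56p1 denominator depends on a copy number of 50, small errors in that estimate shift every fold value, which is one reason these polyclonal comparisons should be read cautiously.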

The results of the S1 nuclease analysis and copy number
determination are presented in Table 3-1. The S1 nuclease assay was
similarly quantitated with the densitometer and the results are
expressed as a ratio of the human to mouse signals. The results,
although from only a few individual cell lines, have been repeated several
times. The S1 nuclease analysis results from the proximal deletion
polyclones suggested that J67 (-47 bp) was unable to correctly initiate
histone mRNA transcription. Only when the promoter was extended in J56
(-73 bp) was correct initiation observed (Figure 3-3). It can be seen
from the data in Table 3-1 that the expression per copy of the J56
construct (-73 bp) is quite low in vivo (expression/copy = 0.0004), and
as noted later this may be partly a reflection of the copy number and
not the amount of 5' sequence present in the construct. When the
flanking sequences are extended to -100 bp in the construct J50 there
is an apparent 80 fold increase in the expression/copy ratio (0.034).
The expression/copy ratio of the remaining deletion constructs
stabilizes at a value of 0.01 to 0.02 with increased length of 5'
sequence. This 25-50 fold increase relative to J56 is probably exaggerated because of
copy number differences between J56 and the longer constructs. This
phenomenon (expression versus copy number) will be discussed later in
the chapter. Still, it is likely that the difference in the
expression/copy ratio is 10 fold. These data are supported by the
results of Ken Wright in our laboratory, who has utilized in vitro
transcription to define the functionality of proximal promoter elements
and demonstrated that in nuclear extracts the transcription of J50

*108A  *L14

AGCCCGGTTGGGATCTGAATTCTCCCGGGGACCGTTGCGTAGGCGTTAAAAAAAAAAAAG 

-200      •         • 
TCGGGCCAACCCTAGACTTAAGAGGGCCCCTGGCAACGCATCCGCAATTTTTTTTTTTTC 

*K8 

AGTGAGAGGGACCTGAGCAGAGTGGAGGAGGAGGGAGAGGAAAACAGAAAAGAAATGACG 

-150 
TCACTCTCCCTGGACTCGTCTCACCTCCTCCTCCCTCTCCTTTTGTCTTTTCTTTACTGC 

*J50  *J56 

AAATGTCGAGAGGGCGGGGACAATTGAGAACGCTTCCCGCCGGGGCGCTTTCGGTTTTCA 

-100       .  .  .  • 

TTTACAGCTCTCCCGCCCCTGTTAACTCTTGCGAAGGGCGGCCGCGCGAAAGGCAAAAGT 

*J67 
ATCTGGTCCGATACTCTTGTATATCAGGGGAAGACGGTGCTCGCCTTGACAGAAGCTGTC 

-50        •  •  •  •  +1 

TAGACCAGGGTATGAGAACATATAGTCCCCTTCTGCCACGAGCGGAACTGTCTTCGACAG 


TATCGGGCTCCAGCGGTCATGTCCGGCAGAGGAAAGGGCGGAAAAGGCTTAGGCAAAGGG 

+50 
ATAGCCCGAGGTCGCCAGTACAGGGCGTCTCCTTTCCCGCCTTTTCCGAATCCGTTTCCC 


Figure  3-5     Schematic  diagram  of  the  proximal  human  histone  H4 
Bal31  deletion  mutants:  Sequence  analysis  of  the 
deletion  points. 

Each  construct  was  sequenced  according  to  the  protocol  of  Maxam  and 
Gilbert  (1980)  and  as  described  in  Materials  and  Methods.  The  deletion 
point  of  each  construct  is  denoted  with  an  asterisk  over  the  last 
nucleotide  included  in  the  sequence  of  that  construct.  For  reference 
the ATG codon, TATA box, GGTCC element, CAAT boxes and Sp1 site have
been underlined. The two bolded regions of the promoter correspond to
Site I and Site II, the DNase I protected regions of protein/DNA
interaction as defined by Pauli et al. (1987).

(-100 bp) is several fold higher than that of J56 (-73 bp) (Ken Wright,
personal communication).

Previously, the deletion points of the Bal31 deletions had been
determined by restriction enzyme analysis and electrophoresis on high
percentage agarose gels (Sierra et al., 1983). To determine the exact
deletion point, each construct was sequenced by the method of Maxam
and  Gilbert  (1980).   Ken  Wright  and  I  collaborated  in  this  effort  and 
the  approach  we  undertook  is  described  in  Materials  and  Methods. 
Importantly,  the  strategy  permitted  us  to  sequence  across  the  deletion 
point  in  each  construct  and  to  determine  the  exact  end  of  Bal31 
digestion. The deletion points we determined are denoted in Figure 3-5.

When we examined the sequence of the J67 (-47 bp) deletion, it was
obvious that the GGTCC element and TATA box were still present and the
proximal CAAT box (-53 bp) was absent. Our S1 nuclease analysis
suggested that this was not sufficient promoter sequence for correct in
vivo transcription initiation. To ensure that this was indeed the
case, we prepared 5 additional polyclonal cell lines of J67 and
demonstrated that they all contained integrated constructs (Figure 3-6b,
c); however, none expressed a correctly initiated histone H4 mRNA
(Figure 3-6a). The absence of a detectable S1 protected fragment in
the J67 polyclonal cell lines was confirmed in several repetitions. Upstream
initiation of transcription was sometimes detectable, although this was
not consistent. The importance of these results became apparent when
Drs.  Urs  Pauli  and  Susan  Chrysogelos  of  our  laboratory  demonstrated  the 
binding  of  proteins  to  the  proximal  promoter  region  of  this  H4  gene  in 


Figure 3-6     S1 nuclease and Southern blot analysis of J67 polyclonal
cell lines for correct human H4 expression and copy
number.

Additional J67 polyclonal cell lines were made to confirm that this
construct was unable to initiate human H4 mRNA transcription correctly.
A. S1 nuclease analysis of 25 µg total cellular RNA from 5 new J67
polyclonal lines and the one tested previously, J67p1a. Also shown are
polyclonal lines 108Ap4 and 108Xp2. H, HeLa total cellular RNA. C,
C127 total cellular RNA. M, pBR322 HpaII markers. The human H4 S1
protected fragment (280 nt) is noted with an arrow at the left. There
was no detectable human H4 signal in any of the J67 lanes even upon
repetition and long exposure. B. Southern blot analysis of J67
polyclonal cell lines for copy number determination. J67 polyclones 1-5
and pF0108Am12 are shown. The position of 1070 bp is noted and the
arrow indicates the size of the deletion EcoRI/XbaI fragment from J67.
Plasmid DNAs in the amount of 10, 50, and 100 pg were included for copy
number quantitation as described in Figure 3-4. H, HeLa cell DNA digested
with EcoRI and XbaI; C, C127 cell DNA digested with EcoRI and XbaI. C.
The blot in B was reprobed with the 18S mouse ribosomal fragment for
quantitation of the amount of DNA in each lane. The size of the 18S
band, 1.3 kb, is noted at the right. Quantitation was done as described
in Materials and Methods and Appendix A.

vivo (Pauli et al., 1987). The specific areas of protein/DNA
interaction as defined by DNase I protection are outlined in Figure 3-5
with the construct deletion end points. Interestingly, the J67
deletion point is located in the middle of Site II and leaves the
proximal portion with the GGTCC element and TATA box intact. It would
appear that, in the absence of Site I, the presence of only half of Site
II is insufficient for transcription initiation in vivo. However,
when all of Site II is present, as in construct J56, a low but
detectable level of transcription is observed (Figure 3-3 and Table 3-1).
The large increase in the expression/copy ratio of the J50 (-100
bp) construct is apparently the result of the inclusion of a sequence in
Site I with remarkable similarity to the Sp1 binding site (Dynan and
Tjian, 1983b), as described by Briggs et al. (1985) and Evans et al.
(1988). Although we have not proven that the
protein/DNA interaction at this site is the result of Sp1, it seems a
strong possibility that it could be Sp1 or a similar protein. J50 also
includes a putative CAAT box; however, the functionality of this
sequence is in question because it lacks the necessary homology to the
consensus sequence. Additionally, this CAAT box is not entirely
included in the protein binding domain of Site I as described by Pauli
et al. (1987), and it is therefore unlikely that it functions in the
same capacity. It should be mentioned that Sp1 has been shown to
interact with CTF in the HSVtk promoter (Jones et al., 1985), and
a possible interaction in the histone promoter should not be ruled out
immediately, although it is unlikely. The CAAT sequence is well
conserved evolutionarily in conjunction with the GGTCC element (Wells,
1986), and our results suggest that the removal of this element in the
distal half of Site II prevents correct transcription initiation.

We investigated whether any distal promoter elements had an
effect on the transcription of the F0108 human H4 histone gene.
Polyclonal cell lines were prepared from constructs pF0005 (-417 bp),
pF0004 (-6.0 to -7.5 kb), pF0002 (-1065 bp), and pF0003 (-6.5 kb). The
results of the S1 nuclease analysis and limited copy number analysis on
these  cell  lines  suggested  that  upstream  sequences  beyond  those  already 
examined  might  contribute  to  an  increased  level  of  expression  (data  not 
shown).   Upon  reflection,  it  is  likely  that  in  most  cases,  the 
increased  level  of  expression  we  noted  was  the  result  of  high  copy 
number,  and  not  necessarily  because  of  a  strong  promoter  sequence  such 
as  an  enhancer.   These  results,  although  limited  at  the  time,  prompted 
us  to  examine  in  a  more  rigorous  way  the  distal  5'  promoter  sequences 
of the F0108 H4 histone gene for possible regulatory areas that control
expression.

Transfection of the constructs pF0005 (-417 bp), pF0002 (-1065 bp),
and pF0003 (-6.5 kb) into mouse C127 cells was done to assess any distal
contributions to the expression level of this H4 gene. As stated
previously, enhancer and silencer/negative regulatory elements can be
located at considerable distances from the promoter of a gene and still
accentuate or depress expression of the linked gene (Maniatis et al.,
1987; Theisen et al., 1986; Baniahmad et al., 1987). The new cell
lines  were  grown  primarily  as  monoclones,  and  for  continuity  with  the 
previous  studies,  monoclonal  cell  lines  of  pF0108A  and  K8  were  also 
prepared.

I will state now that we have found that there is competition
between the transfected human H4 histone genes and the endogenous mouse
H4 gene for regulatory factors; this is discussed later and in
chapter  4.   The  interpretation  of  expression  from  each  construct  is 
affected  by  this  competition  phenomenon,  and  becomes  rather  confusing. 
We  bring  this  up  here  only  to  make  the  reader  aware  that  this  situation 
exists,  and  the  results  have  been  interpreted  several  ways,  sometimes 
with  this  taken  into  account.   It  has  been  extremely  difficult  to 
understand  the  relationship  that  exists  between  the  endogenous  mouse  H4 
genes  and  the  transfected  human  H4  genes.   We  have  analyzed  the 
expression/copy  data  carefully  to  decipher  any  trends.   The  results  of 
this  analysis  are  also  reviewed  in  chapter  4.   The  choice  of  the  mouse 
H4 as an internal control for the S1 nuclease analysis was both
fortunate  and  detrimental  to  our  interpretation.   In  short,  the  entire 
expression  analysis  is  presented  here,  but  because  of  the  realization 
later  in  the  course  of  this  work  about  copy  number  and  competition  for 
transcription  factors,  only  some  of  the  data  will  be  incorporated  into 
the  final  synopsis. 

The  monoclonal  cell  lines  were  analyzed  for  the  level  of  expression 
and copy number present. The S1 nuclease analysis of the pF0003
monoclonal  cell  lines  is  presented  in  Figure  3-7  and  was  done  as 
described  in  Materials  and  Methods.   Almost  all  of  the  monoclones  were 
positive  for  expression  of  the  human  H4  histone  gene  with  the  exception 
of pF0003m18. We utilized several exposures to determine,
densitometrically, the level of expression from each cell line. The
expression data are presented as a ratio of the human and mouse

Figure 3-7     S1 nuclease analysis of pF0003 monoclonal cell lines.

S1 nuclease assays were performed as described in Materials and
Methods. All but one of the 15 clones shown here are positive for expression
of the human H4 gene. The exception is pF0003m18. H, HeLa total
cellular RNA. C, C127 total cellular RNA. M, pBR322 digested with HpaII
and labelled with α-32P-dCTP and Klenow fragment. Dilutions of the
marker are noted as 1:4, 1:8, 1:16 and 1:32 for densitometry purposes.
The  human  (280  nt)  and  mouse  (110  nt)  protected  fragments  are  denoted 
with  labels  and  arrows  at  the  left.  The  clone  numbers  appear  above  the 
individual  lanes  to  which  they  correspond. 


Figure  3-8     Southern  blot  analysis  of  pF0003  monoclonal  cell  lines. 

Southern blot analysis was performed as described in Materials and
Methods. 10 µg of DNA from each cell line were analyzed with the
nick-translated EcoRI/XbaI fragment from pF0002. A. pF0003 cell line DNA
probed with H4 sequences. B. The histone probe was removed and the blot
was reprobed with the mouse 18S ribosomal fragment. Densitometry of the
1070 bp band specified by the arrow in A and the 18S ribosomal band in
B permitted quantitation of the copy number through normalization to
the amount of DNA actually loaded and transferred as described in the
Materials and Methods. The figure in A is a composite of several
exposures that reflects the actual copy number and accounts for
original quantitation errors. The plasmid controls for quantitation are
labelled 10, 50 and 100 designating the number of pg loaded. C, C127
cellular DNA. H, HeLa cellular DNA. M, λ DNA digested with EcoRI and
HindIII and labelled with α-32P-dCTP and Klenow fragment. The number
of each clone is designated above the lane.

Figure 3-9     S1 nuclease analysis of pF0108A and pF0002 monoclonal
cell lines.

S1 nuclease assays were performed as described in Materials and
Methods. The left panel is representative of results obtained from
pF0108A cell lines; the right panel shows results obtained with total
cellular RNA from pF0002
cell lines. The human and mouse protected fragments are designated with
labels and arrows. The markers, M, are pBR322 digested with HpaII and
important sizes are noted. The number above each lane corresponds to
the clone number of that construct. The markers were diluted 1:4 (M 1:4) and
1:8 (M 1:8) for densitometry quantitation purposes. H, HeLa total cellular
RNA. C, C127 total cellular RNA.


Figure 3-10    Copy number analysis of pF0002 and pF0108A monoclonal
cell lines.

Southern blot analysis was performed as described in Materials and
Methods. 10 µg of DNA from each cell line were analyzed with the
nick-translated EcoRI/XbaI fragment from pF0002. A. pF0108A and pF0002 cell
line DNA probed with H4 sequences. B. The histone probe was removed and
the blot was reprobed with the mouse 18S ribosomal fragment.
Densitometry of the 1070 bp band specified by the arrow in A and the
18S ribosomal band in B permitted quantitation of the copy number
through normalization to the amount of DNA actually loaded and
transferred as described in the Materials and Methods. The figure in A
is a composite of several exposures that reflects the actual copy
number and accounts for original quantitation errors. The plasmid
controls for quantitation are labelled 10, 50 and 100 designating the
number of pg loaded. C, C127 cellular DNA. H, HeLa cellular DNA. M,
λ DNA digested with EcoRI and HindIII and labelled with α-32P-dCTP and
Klenow fragment. Each set of clones is designated with the black bar
and  the  number  of  the  individual  clones  is  above  the  lane. 
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densitometry  signals  in  Table  3-2  (p.  103).   The  average  expression  of 
nine  pF0003  monoclonal  cell  lines,  for  which  copy  number  was  later 
determined,  was  2.29  ±  2.43. 

Because expression varied widely among these cell lines, the copy
number of each cell line was determined from the Southern blots in
Figure 3-8a,b. The Southern blots of pF0003 monoclonal cell line
genomic DNA, digested with EcoRI and XbaI, were prepared as detailed
earlier and in Materials and Methods. The hybridization probe was the
1070 bp EcoRI/XbaI fragment isolated from pF0002 and nick-translated.
The actual copy number of each cell line was determined by
densitometric analysis of the 1070 bp EcoRI/XbaI band with
normalization for the amount of DNA actually loaded. The amount of DNA
in each lane was determined by removal of the histone probe at 80°C in
0.1X SSC and subsequent hybridization with the oligo-labelled
BamHI/SalI fragment of the mouse 18S ribosomal gene. Densitometry of
the 18S ribosomal band (Figure 3-8b) permitted normalization of the
histone H4 copy numbers and comparison to the plasmid controls for copy
number (see Appendix A for sample calculation of copy number).
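
The normalization described above can be sketched in a few lines of
Python. This is only an illustration of the general approach (a plasmid
standard curve plus an 18S loading correction), not the sample
calculation of Appendix A, which is not reproduced here. The function
names, the plasmid size, the mouse haploid genome size of 2.7e9 bp, and
every densitometry reading used in the example are assumptions.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for the plasmid standard curve."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copies_per_genome(h4_density, rrna_density, rrna_reference,
                      standards, plasmid_bp,
                      genome_bp=2.7e9, dna_loaded_ug=10.0):
    """Estimate construct copies per haploid genome (hypothetical helper).

    standards      -- list of (pg of plasmid loaded, band density) pairs
    rrna_reference -- 18S density expected for a fully loaded lane
    plasmid_bp     -- size of the plasmid standard (assumed, not from the text)
    """
    slope, intercept = fit_line([pg for pg, _ in standards],
                                [d for _, d in standards])
    # Convert the H4 band density to pg of plasmid equivalent.
    plasmid_pg = (h4_density - intercept) / slope
    # Correct the nominal load for what actually reached the blot.
    loading = rrna_density / rrna_reference
    genomic_pg = dna_loaded_ug * 1e6 * loading
    # Ratio of plasmid molecules to haploid genomes in the lane.
    return (plasmid_pg / plasmid_bp) / (genomic_pg / genome_bp)


# Hypothetical readings: 10, 50 and 100 pg standards on a linear scale.
example_standards = [(10, 1.0), (50, 5.0), (100, 10.0)]
single_copy = copies_per_genome(1.0, 2.0, 2.0, example_standards, plasmid_bp=2700)
```

With these made-up numbers a lane that received only half of the
nominal 10 µg would return twice the copy number for the same H4 band
density, which is the purpose of the 18S reprobing step.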

The copy number data help to explain some of the variation seen in
the original expression determination for each cell line. When pF0003
copy number is taken into account for the expression data in Table 3-2,
the expression/copy ratio for all of the cell lines is lowered and the
average expression/copy is 0.094 ± 0.091. It is apparent from the data
in Table 3-2 that as copy number increases, the expression/copy
increases until approximately 20-40 copies are present, after which it
declines. The pF0003m15 cell line is perhaps lower than expected with
respect to expression because of an unusual or deleterious integration
site. The threshold of expression at 20-40 copies indicated that a
limited number of human histone genes could be integrated and expressed
in any one cell. This phenomenon has been investigated further and is
discussed later in light of genomic sequencing data presented in
Chapter 4. Overall, the pF0003 monoclonal cell lines had higher
expression levels than other cell lines (compare expression values with
others in Table 3-2), but the expression/copy was similar. Since copy
number was implicated in the level of expression, we also calculated
the average copy number of each group of monoclonal cell lines and this
is presented in Table 3-2. The level of expression, as we have
determined it here (Table 3-2), is a direct reflection of the copy
number.
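
The expression/copy summary quoted above can be checked with a short
Python sketch. The EXP/CN values are copied from the pF0003 rows of
Table 3-2; the function names are ours. The population standard
deviation is used because it reproduces the tabulated ± 0.091, which is
an inference from the numbers rather than a statement of the method
actually used.

```python
from statistics import mean, pstdev

# EXP/CN for the nine pF0003 monoclonal cell lines, copied from Table 3-2.
PF0003_EXP_PER_COPY = [0.025, 0.038, 0.180, 0.038, 0.084,
                       0.291, 0.159, 0.009, 0.018]


def expression_per_copy(expression, copy_number):
    """Expression (human/mouse S1 signal ratio) per integrated copy."""
    return expression / copy_number


def summarize(values):
    """Mean and population standard deviation, rounded as in Table 3-2."""
    return round(mean(values), 3), round(pstdev(values), 3)


avg, std = summarize(PF0003_EXP_PER_COPY)
print(f"pF0003 expression/copy = {avg:.3f} +/- {std:.3f}")
```

For a single line, `expression_per_copy(1.800, 10)` gives the 0.180
listed for 003m4.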

The results of the S1 analysis of the pF0108A and pF0002 monoclonal
cell lines are presented in Figure 3-9. Both cell lines expressed at a
relatively low level and the numerical data are presented in Table 3-2.
The average level of expression/copy for pF0108A is 0.079 ± 0.061 and
for pF0002 is 0.045 ± 0.053. The data collected for the pF0108A
monoclones were previously divided into two groups. Originally, there
was a construct, designated J40, that after sequencing of the deletion
points was found to be identical to pF0108A. Therefore, these data were
incorporated into the pF0108A data base. It is interesting to note that
pF0108A and J40 were thought to have different lengths of 5' sequence
and yet their expression was shown to be almost identical. This
separation of the original observations lends a measure of confidence
to the analysis process that has been used in these studies.

Figure 3-11  S1 nuclease analysis of pF0005 monoclonal cell lines.

Twenty-five micrograms of total cellular RNA from each of the cell
lines were treated as described in Materials and Methods and the
autoradiograph of the S1 nuclease analysis was quantitated by
densitometry. Lanes are designated with the clone number of the cell
line. HeLa cell total RNA hybridized to both human and mouse probes, H.
C127 RNA hybridized to both human and mouse probes, C. pBR322 HpaII
markers labelled with α-32P-dCTP and Klenow fragment, M. One fourth the
amount of marker was electrophoresed for quantitation purposes, M1:4.
The construct name, pF0005, is displayed above the black line. Both
human and mouse (280 nt and 110 nt respectively) protected fragments
are noted at the right.


Figure 3-12  Copy number analysis of pF0005 monoclonal cell lines.

Southern blot analysis was done as described in Materials and Methods.
All abbreviations are as designated in Figure 3-10. The quantitation of
the histone H4 blot (A) and the mouse 18S ribosomal probed blot (B) is
as before in Figure 3-10 and Materials and Methods.

The expression data for pF0002 monoclones 9 and 10 were determined
(Table 3-2); however, when the copy number was determined there was no
band at 1070 bp, nor any additional bands that corresponded to the
EcoRI/XbaI fragment (data not shown). The copy number of the pF0002 and
pF0108A cell lines (Figures 3-10a,b) was determined as described for
pF0003 and in Materials and Methods. With respect to pF0002m9 and m10,
we assume that either the construct was lost in the time between the
harvesting of cells for the purification of RNA and subsequently DNA,
or that the integration event destroyed one of the restriction sites,
making detection impossible. This was the only case where expression of
the human H4 histone gene was detected but no copies were detectable.
Due to the constraints of the tissue culture system we usually prepared
several plates of cells for the isolation of RNA, and then 1 or 2
passages later were able to harvest cells for isolation of DNA. The
lanes of the pF0002 Southern blot exhibited no other bands that might
have corresponded to the integrated pF0002 construct.

S1 nuclease analysis and copy number determination were also done
for the pF0005 monoclonal cell lines and the results are presented in
Figures 3-11 and 3-12a,b. pF0005 exhibited the most consistency in
the level of expression (0.546 ± 0.354) and nearly every monoclonal
line was positive for expression of the human H4 gene. The
expression/copy ratio was 0.201 ± 0.140.

The shortest deletion construct for which monoclonal cell lines
were made was K8, an original Bal31 deletion (Sierra et al., 1983).
The expression from all six measurable monoclones was relatively low
(Table 3-2; expression = 0.114 ± 0.066, expression/copy number = 0.075
± 0.077). In addition, there were several K8 monoclones, including
K8m12, m19, and m20, in which an S1 nuclease protected fragment was
present by visual inspection, but at a level below that detectable with
the densitometer. The S1 nuclease protection assay and Southern blot
analysis are presented in Figures 3-13 and 3-14a,b. These results
agreed with our previous polyclonal cell line results, which suggested
that an increase in the length of the H4 promoter resulted in increased
expression.

Figure 3-13  S1 nuclease analysis of K8 monoclonal cell lines.

S1 nuclease assays were done as described in Materials and Methods. The
clone number of each cell line is denoted above the lane. M, pBR322
digested with HpaII and labelled with Klenow. H, HeLa total cellular
RNA. C, C127 total cellular RNA. Dilutions of the marker are specified
1:4 and 1:8. Human and mouse H4 protected fragments are specified.


Figure 3-14  Copy number analysis of K8 monoclonal cell lines.

Southern blot analysis was performed as described in Materials and
Methods. The K8 EcoRI/XbaI fragment is shorter due to the Bal31
deletion and is designated with an arrow at the left. The same controls
as in Figure 3-10 have been included. Quantitation of A (histone H4
probe) and B (reprobed with mouse 18S ribosomal) was as described in
Figure 3-10 and Materials and Methods. Nonessential lanes in B have
been deleted.

We were also concerned that differences in the 3' end of some of
our constructs might affect the level of expression. The differences
in the 3' ends of the constructs were not intentional, but arose as a
result of the cloning strategies employed to produce the 5' deletions.
To address this question we prepared the construct pF0108X (see
Appendix C). This construct has -210 bp of 5' flanking sequence, but
the XbaI/HindIII fragment at the 3' end has been deleted from pF0108A.
Also, this construct was made in pUC19. This 3' deletion effectively
removes 770 bp from the 3' flanking region of the pF0108A H4 gene.
Monoclonal cell lines of pF0108X were prepared and assayed for
expression and copy number as before. The results of the analysis are
presented in Figures 3-15 and 3-16a,b and the expression levels are
calculated in Table 3-2. The expression of pF0108X was not
significantly different from that of pF0108A; these results suggest
that the nucleotides from the XbaI (+1107 bp) site to the HindIII
(+1877 bp) site, removed from the 3' end in pF0108X, had little if any
effect on the level of transcription. The construct pF0006 was also

Figure 3-15  S1 nuclease analysis of pF0006 and pF0108X monoclonal
cell lines.

S1 analysis of pF0006m3, 4, 6, 8, 11 and pF0108Xm2, 3, 5, 6, 9 is
presented. 25 µg of total cellular RNA from log phase cells was mixed
with an excess of human and mouse H4 histone probes, and S1 nuclease
reactions, electrophoresis, and densitometry were done as described in
Materials and Methods. The human and mouse signals (280 nt and 110 nt
respectively) are noted on the right. The markers, M, are pBR322
digested with HpaII and labelled with Klenow fragment and α-32P-dCTP.
The number above the lane designates the clone number, and the black
line defines the construct.


Figure 3-16  Copy number analysis of pF0006 and pF0108X monoclonal
cell lines.

Southern blot analysis was done as described in Materials and Methods.
Quantitation of A and B (A, probed with human H4 histone; B, probed
with mouse 18S ribosomal gene) was as described before in Figure 3-10.
All designations are as described earlier in Figure 3-10.

Table 3-2.  Quantitation of Monoclonal Cell Line Expression


CLONE(a)  EXP(b) CN(c)  EXP/CN(d)    CLONE        EXP    CN     EXP/CN

K8m13       .091     1       .091    002m9       .036    ND
K8m17       .180     1       .180    002m10      .135    ND
K8m14       .170     1       .170    002m2       .135     1       .135
K8m18       .030     5       .006    002m3       .175     7       .025
K8m5        .032    13       .002    002m8       .179    14       .013
K8m9        .180    28       .002    002m7       .084    16       .005
AVG(e)      .114     8       .075    AVG         .124    10       .045
STD(f)      .066             .077    STD         .050     4       .053

108Am7      .123     1       .123    003m17      .100             .025
108Am1      .013     1       .013    003m13      .300     8       .038
108Am5      .056     1       .056    003m4      1.800    10       .180
108Am9      .210     1       .210    003m5       .500    13       .038
108Am10     .110     1       .110    003m14     1.770    21       .084
108Am12     .143     4       .036    003m16     6.780    23       .291
108Am8      .410     4       .103    003m2      6.500    41       .159
108Am2      .940    19       .049    003m15      .400    44       .009
108Am14     .240    30       .008    003m1      2.500   139       .018
AVG         .249     7       .079    AVG        2.286    34       .094
STD         .268             .061    STD        2.433             .091

005m14      .450     1       .450    007m1       .234    ND
005m16      .310     1       .310    007m2       .208    ND
005m7       .120     1       .120    007m4       .142    ND
005m11      .380     2       .190    007m8       .034    ND
005m6       .160     2       .080    007m9       .135    ND
005m3       .780     2       .390    007m10      .123    ND
005m13      .970     3       .323    007m12      .957    ND
005m18      .810     5       .162
005m17      .430     5       .086    AVG         .261
005m19      .630    22       .029    STD         .313
005m5      1.280    31       .041
AVG         .546     6       .201
STD         .354             .140

108Xm2      .086     1       .086    004m6       .075     1       .075
108Xm3      .021     1       .021    004m14      .250     1       .250
108Xm9      .021     1       .021    004m2       .050    11       .005
108Xm5      .226     2       .113    004m8       2.38    38       .063
108Xm6      .208    30       .006    004m11      .125    90       .001
AVG         .112     7       .049    004m10      .175   154       .001
STD         .089             .042    004m19       5.4   188       .029
                                     004m1       1.27   252       .005
                                     AVG         1.22   ~92       .054
                                     STD         1.76             .079

006m3      5.050    30       .168    004Rm3      .020    ND
006m6       .013    ND               004Rm4      .040    ND
006m8       .158    ND               004Rm7      .110    ND
006m11      .102    ND               004Rm9      .045    ND
AVG(g)      .091                     004Rm10     .040    ND
                                     AVG         .057
                                     STD         .031

Autoradiograms were scanned with a laser densitometer as described in
Chapter 2 and the level of expression has been determined and
presented here as a ratio of the human and mouse S1 signals.

a.  The construct used to make the cell line and the clone number that
designates that cell line.

b.  Expression (EXP): a ratio of the human and mouse S1 protected
fragments as determined by densitometry. Further description of the
calculation and densitometry procedures is given in the Materials
and Methods section.

c.  The copy number (CN) of the cell line as determined by Southern
blot analysis. ND, not determined.

d.  Expression divided by the copy number of the cell line (EXP/CN).
The number represents the level of expression per copy of the
construct integrated.

e.  AVG, the average of either EXP or EXP/CN.

f.  STD, the standard deviation of the average value.

g.  The data from 006m3 were not included in the average calculation.

assayed with pF0108X, but will be discussed later in the chapter
with respect to a putative enhancer element.

The data collected in Table 3-2 were arranged and the constructs
placed in rank order in comparison with one another. This is
graphically presented in Figure 3-17. Both the average level of
expression and expression/copy are presented with the standard
deviation of each calculation. The first observation is that as the
amount of 5' sequence is extended out to -410 bp in construct pF0005
the level of expression increases approximately 3-fold above that of
pF0108A. There are significant differences between expression and
expression/copy. This is most easily seen when copy number is included
in the expression value for pF0003. There is obviously a large
standard deviation in these results and this is probably a reflection
of the inaccuracies inherent in the system available.

There are statistically significant differences in spite of the high
variability encountered in this assay system. To analyze the data
statistically we employed the services of the University of Florida
Biostatistics Unit and Dr. Mike Conlon. It was decided that the most
powerful statistical test that could be employed on these data was the
Wilcoxon Rank Sum test. This test and the analyses have been described
in the Materials and Methods section of this work.

Figure 3-17  Graphic analysis of human H4 histone gene expression in
mouse C127 cells.

The average expression for each group of monoclones is plotted. The
average expression (EXP) and expression/copy (EXP/CN) were calculated
in Table 3-2 and are plotted here with the standard deviation for each
average value shown as a one-way error bar.

Previously we demonstrated that the K8 monoclones (-155 bp) and the
pF0108A monoclones (-215 bp) were not significantly different; these
results suggested that there was little contribution from the sequence
between -155 bp and -215 bp to the level of transcription. When the
data for the pF0005 monoclones were compared to the K8 monoclones, the
difference in the average expression/copy (pF0005 = 0.201 versus K8 =
0.075) was significant at p < 0.05. This suggested the existence of a
positive regulatory element in the sequences from -215 bp to -417 bp.
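
For readers who wish to repeat this kind of comparison, the following
Python sketch computes a Wilcoxon rank sum statistic with an exact
two-sided permutation p-value. It is a minimal illustration, not the
analysis performed by the Biostatistics Unit: the treatment of ties
(midranks), the two-sided alternative, and the function names are our
assumptions, and the procedure described in Materials and Methods may
differ. The EXP/CN values are copied from the K8 and pF0005 rows of
Table 3-2.

```python
from itertools import combinations

# EXP/CN values copied from Table 3-2.
K8 = [0.091, 0.180, 0.170, 0.006, 0.002, 0.002]
PF0005 = [0.450, 0.310, 0.120, 0.190, 0.080, 0.390,
          0.323, 0.162, 0.086, 0.029, 0.041]


def midranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (rank sum of x, exact two-sided permutation p-value)."""
    ranks = midranks(list(x) + list(y))
    n = len(x)
    observed = sum(ranks[:n])
    expected = n * (len(ranks) + 1) / 2
    deviation = abs(observed - expected)
    total = extreme = 0
    # Enumerate every way of assigning n of the pooled ranks to group x.
    for combo in combinations(ranks, n):
        total += 1
        if abs(sum(combo) - expected) >= deviation - 1e-9:
            extreme += 1
    return observed, extreme / total


w, p = rank_sum_test(K8, PF0005)
print(f"K8 rank sum = {w}, two-sided exact p = {p:.3f}")
```

Exhaustive enumeration is practical here because the groups are small
(6 and 11 cell lines); for larger groups a normal approximation would
normally be substituted.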

Previous polyclonal cell line analysis had demonstrated a
difference between pF0005 and the shorter deletions; however, the
number of samples was small and precluded any statistical analysis.
Since copy number had been demonstrated to be an important variable, we
compared the pF0108A and pF0005 monoclonal cell lines with copy numbers
less than 10 in the Wilcoxon Rank Sum test, and found that in this
group of data, pF0005 was significantly (p < 0.05) higher in expression
than pF0108A. If the entire data base was utilized, then the two
constructs were not significantly different at the 0.05 level
(p < 0.1). This was presumably the result of several high copy number
cell lines that skewed the group. The data were consistent with the
idea that a gradual increase in the 5' flanking sequences contributes
to an increase in the level of human H4 histone gene expression. The
data also suggested that there might be a positive regulatory element
between -210 bp (pF0108A) and -410 bp (pF0005), although it was not
clearly definable.

Protein/DNA interactions in this region of the promoter were
detected in vitro by van Wijnen et al. (1987); unfortunately, these
studies were not pursued. In vivo, there were no detectable
protein/DNA interactions in the -210 bp to -410 bp region of the
promoter (Pauli et al., 1987). We have done a preliminary investigation
into the putative positive element and our studies are detailed below.
In addition, the data presented in Table 3-2 were reevaluated with
respect to the effect of copy number on expression; the low copy
number data, which were most representative of the results and least
affected by the competition phenomenon mentioned earlier, are
discussed and analyzed in Chapter 5.

Distal  Transcriptional  Regulatory  Elements 

Inspection of the data in Table 3-2, graphically presented in
Figure 3-17, demonstrates two points. First, as the length of the H4
promoter sequence increases to -410 bp the average level of expression
rises. The difference in expression between the monoclonal cell lines
of pF0108A and pF0005 is statistically significant (p < 0.05). The
second point is that the expression of pF0002 is significantly lower
than that of pF0005. This result is based on the comparison of only 4
of the 6 monoclonal cell lines which were positive for expression.
Unfortunately, two of the pF0002 monoclones (pF0002m9 and pF0002m10)
had no detectable EcoRI/XbaI fragment in the copy number experiment.
We should note that if one assumes that these cell lines had only 1
copy of the H4 gene integrated and incorporates all 6 monoclonal cell
lines into the data base, the difference in expression between pF0005
and pF0002 is still statistically significant. The fact that pF0002
(-1065 bp) was lower in expression than pF0005 (-417 bp) suggested that
there might be a negative regulatory element in the more distal
sequences of the H4 promoter.

The objective of the next experiments was to determine if there
was a negative regulatory element located between -410 and -1065 bp in
the human H4 histone gene promoter. There were several other lines of
evidence that suggested that sequences upstream of -410 bp might
influence the expression of this gene in a negative fashion. When we

Figure 3-19  Annotated sequence of the pF0002 5' flanking sequence.

Both strands of the sequence are shown from +70 bp to -1010 bp.
Construct deletion points are denoted with an asterisk over the last
base in the clone and the clone name. A number of homologies to various
elements have been designated. The proximal 210 bp have the ATG, TATA,
CAAT, and GGTCC elements underlined. Also Site I and Site II are
bolded. From -720 to -820 bp a DNase I hypersensitive site is denoted
with a string of asterisks above the sequence. Putative nuclear matrix
attachment sites, underlined, are located at -680 bp and -940 bp
(bases that do not match the consensus sequence are bolded). A putative
topoisomerase II site is found at -881 bp to -895 bp and has been
confirmed by Dr. T. Rowe (personal communication). Nuclear matrix
associated T-boxes are underlined at -925 and -885 bp. Two putative
negative regulatory elements are underlined at -580 and -710 bp.

*002  (-55bp)

GCAGAATATCCCTCAGTCTTCTCTATGTAGCAGGCCCTCCATATACGCGGGTTCCCCAAG 
-1000     .... 

CGTCTTATAGGGAGTCAGAAGAGATACATCGTCCGGGAGGTATATGCGCCCAAGGGGTTC 

*002D1 

ACCGAAAATATTAAACAAATGAATTTCTTTTTTAAAAAAAAGTACAACAAAAGATAGTAA
-950      •         •         .         .         .900 

TGGCTTTTATAATTTGTTTACTTAAAGAAAAAATTTTTTTTCATGTTGTTTTCTATCATT 


AAATAAAAACAGTATAACAATTACTTACATAGCTTTACAGACTGGATTGGTGTTCGAAGT 

-850 
TTTATTTTTGTCATATTGTTAATGAATGTATCGAAATGTGTGACCTAACCACAAGCTTCA 

AATTTGAGCTTATTTAAAGTACACGGGAGGATGTGCATAGTTATGTGCAAATACTACCCG 

-800 

TTAAACTCGAATAAATTTCATGTGCCCTCCTACACGTATCAATACACGTTTATGATGGGG 

*002E9 

ACTTTCTATGAGAGACTTGAGCAACCTGATTTTGGTATCGGCGGGGGCCCTGACCAATGC 

-750 

TGAAAGATACTCTCTGAACTCGTTGGACTAAAACCATAGCCGCCCCCGGGACTGGTTAGG 


CCTCTCAGTTCTACCGAGGGAGAACTGTTTTGTTTCTTCCGCACGGCTTTGACGGACAGT 
-700      •         ■         .         . 

GGAGAGTCAAGATGGCTCCCTCTTGACAAAACAAAGAAGGCGTGCCGAAACTGGCTGTCA 


GTGTTGGGATTCGCTGGACCATGAGAAAGCTTGGCAGCATGCTGTGACCGGTTTTCCCAG 
-650      •         ■         .         .         .600 

CACAACCCTAAGCGACCTGGTACTCTTTCGAACCGTCGTACGACACTGGCCAAAAGGGTC 


*007 

GGCCAGAATTCTCCTGTGTGAGCTAAAATACAGTGGCTCGGTCCAACAAAACAGAGCCTG 

-550 
CCGGTCTTAAGAGGACACACTCGATTTTATGTCACCGAGCCAGGTTGTTTTGTCTCGGAC 

GAGCCAGGAATTATGGCGAACCTGCTCCCTCCGTCCTGCTTCGGCGAAGATCCCTGGCGC 

-500 

CTCGGTCCTTAATACCGCTTGGACGAGGGAGGCAGGAGGAAGCCGCTTCTAGGGACCGCG 

*005 

GCGTCCTTGAGGTCGCCTTCGGTGTTGACCTCATCGTCGGAACGGCGCTTCCTGAAGCTT 

-450

CGCAGGAACTCCAGCGGAAGCCACAACTGGAGTAGCAGCCTTGCCGCGAAGGACTTCGAA


TATATAAGCACGGCTCTGAATCCGCTCGTCGGATTAAATCCTGCGCTGGCGTCCTGCCAG
-400       •  .  .  . 

ATATATTCGTGCCGAGACTTAGGCGAGCAGCCTAATTTAGGACGCGACCGCAGGACGGTC 


TCTCTCGCTCCATTTGCTCTTCCTGAGGCTCCCTCCAGAGACCTTTCCGTTAGCCTCAGT 
-350      ....         .300 

AGAGAGCGAGGTAAACGAGAAGGACTCCGAGGGAGGTCTCTGGAAAGGGAATCGGAGTCA 


GCGAATGCTTCCGGGCGTCCTCAGAACCAGAGCACAGCGAAAGCCACTACAGAATCCGGA 

-250 
CGCTTACGAAGGCCCGCAGGAGTCTTGGTCTCGTGTCGGTTTCGGTGATGTCTTAGGCCT 

*108A  *L14 

AGCCCGGTTGGGATCTGAATTCTCCCGGGGACCGTTGCGTAGGCGTTAAAAAAAAAAAAG 

-200 

TCGGGCCAACCCTAGACTTAAGAGGGCCCCTGGCAACGCATCCGCAATTTTTTTTTTTTC 

*K8 

AGTGAGAGGGACCTGAGCAGAGTGGAGGAGGAGGGAGAGGAAAACAGAAAAGAAATGACG 

-150 

TCACTCTCCCTGGACTCGTCTCACCTCCTCCTCCCTCTCCTTTTGTCTTTTCTTTAGTGC 

*J50  *J56 

AAATGTCGAGAGGGCGGGGACAATTGAGAACGCTTCCCGCCGGCGCGCTTTCGGTTTTCA 
-100       •  •  •  . 

TTTACAGCTCTCCCGCCCCTGTTAACTCTTGCGAAGGGCGGCCGCGCGAAAGCCAAAAGT 

*J67 

ATCTGGTCCGATACTCTTGTATATCAGGGGAAGACGGTGCTCGCCTTGACAGAAGCTGTC 
-50        •  .  .  •  +1 

TAGACCAGGCTATGAGAACATATAGTCCCCTTCTGCCACGAGCGGAACTGTCTTCGACAG 


TATCGGGCTCCAGCGGTCATGTCCGGCAGAGGAAAGGGCGGAAAAGGCTTAGGCAAAGGG 

+50 

ATAGCCCGAGGTCGCCAGTACAGGCCGTCTCCTTTCCCGCCTTTTCCGAATCCGTTTCCC 

tried to make cell lines with the construct pF0001 (an internal
deletion from -210 to -610 bp, see Figure 3-1), we found very few cell
lines that expressed the transfected human H4 gene. Only 1 of 10
polyclonal lines and 2 of 12 monoclonal lines were positive for
expression, and these were barely detectable (data not shown). This
result supported the idea that there was a positive regulatory element
in the region between -210 and -417 bp and, since the expression was
very low, perhaps a negative element upstream of -586 bp. In addition,
polyclonal lines of pF0002 appeared to have lower expression than
polyclonal lines of pF0005 (data not shown).
To address the possibility of a negative regulatory element we
first decided to sequence the region of the promoter from -610 bp to
-1065 bp. pF0002 DNA was digested with either BamHI, EcoRI, or
HindIII, treated with phosphatase and labelled as described in
Materials and Methods and the protocol of Maxam and Gilbert (1980).
Figure 3-18 schematically displays the strategy utilized to determine
the sequence of the upstream region. The sequence has several unusual
characteristics and is presented as part of the entire pF0002 sequence
in Figure 3-19. The distal end of the fragment, from -800 bp to -960
bp, is very A/T rich (70%) with several homopolymeric runs of each. In
addition, a search of the region revealed two sequences with strong
similarity to nuclear matrix attachment sites (-940 bp and -680 bp) and
associated T-boxes (-925 bp and -890 bp, bottom strand) (Gasser and
Laemmli, 1987). Near the upstream matrix site a putative
topoisomerase site (-890 bp) was identified. The presence of this
topoisomerase site has been confirmed in vitro by Dr. Tom Rowe
(personal communication). Additionally, Dr. Susan Chrysogelos of our
laboratory demonstrated the presence of a DNase I hypersensitive site
(-720 to -820 bp) between the two putative nuclear matrix attachment
sites (personal communication). This arrangement of chromatin
structure, nuclear matrix sites flanking a nuclease hypersensitive
site, is very similar to that demonstrated previously by Gasser and
Laemmli (1986) and was at least circumstantial evidence that this
region might be involved in attachment to the nuclear matrix.
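
The A/T content of a stretch such as the -800 to -960 bp region can be
estimated with a short Python sketch. The function names, the window
size, and the threshold are illustrative assumptions; the sequence to
be scanned would be the top strand transcribed from Figure 3-19.

```python
def at_fraction(seq):
    """Fraction of A and T residues in a DNA sequence."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def at_rich_windows(seq, window=20, threshold=0.7):
    """Return (start, fraction) for each window at or above the threshold.

    Start positions are 0-based offsets into seq, not promoter
    coordinates; converting them to bp relative to the cap site is left
    to the caller.
    """
    hits = []
    for start in range(len(seq) - window + 1):
        fraction = at_fraction(seq[start:start + window])
        if fraction >= threshold:
            hits.append((start, fraction))
    return hits


# Toy sequence, not taken from the H4 promoter.
toy_hits = at_rich_windows("GGGGAAAATTTTGGGG", window=8, threshold=1.0)
```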

We compared the entire 5' flanking sequence of the F0108 H4
histone gene (-1 to -1065 bp) with the consensus sequences for several
groups of negative regulatory elements as described by Baniahmad et
al. (1987). They compared the promoter sequences of a number of genes
subject to negative regulation and determined two consensus elements.
These elements were termed Box 1 (5'-ANCCTCTCC-3') and Box 2
(5'-ANTCTCCTCC-3'). Good homologies to both elements were found in the
H4 histone upstream region at -710 bp (Box 1) and -580 bp (Box 2) as
designated in Figure 3-19. Dr. Susan Chrysogelos, of our laboratory,
demonstrated that the region of the H4 promoter from -585 to -1065 bp
has middle repetitive character (personal communication). As mentioned
in the introduction, Laimins et al. (1986) associated middle repetitive
character with some negative regulatory elements (chicken lysozyme gene
and rat insulin like growth gene).
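
A consensus search of this kind can be sketched in Python as follows.
The Box 1 and Box 2 strings are those quoted above; the scanning
function, the handling of N as a wildcard, and the mismatch allowance
are our assumptions and do not describe the search that was actually
performed.

```python
BOX1 = "ANCCTCTCC"
BOX2 = "ANTCTCCTCC"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(window, consensus):
    """Count positions that differ from the consensus; N matches any base."""
    return sum(1 for base, ref in zip(window, consensus)
               if ref != "N" and base != ref)


def find_consensus(seq, consensus, max_mismatches=1):
    """Scan both strands and return (start, strand, mismatches) tuples.

    start is the 0-based offset of the leftmost matched base on the
    top strand, whichever strand carries the match.
    """
    seq = seq.upper()
    size = len(consensus)
    hits = []
    for strand, target in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(target) - size + 1):
            mismatches = count_mismatches(target[i:i + size], consensus)
            if mismatches <= max_mismatches:
                start = i if strand == "+" else len(seq) - i - size
                hits.append((start, strand, mismatches))
    return hits


# Toy sequence containing one perfect Box 1 match on the top strand.
toy_hits = find_consensus("GGGATCCTCTCCGGG", BOX1, max_mismatches=0)
```

Allowing one or two mismatches corresponds to looking for "good
homologies" rather than exact copies of the consensus.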

To  investigate  whether  there  was  any  functionality  associated  with 
the  two  putative  negative  regulatory  elements  that  were  implicated  via 
similarity  to  previously  identified  negative  elements  we  constructed 
two  deletion  mutants  of  pF0002  in  the  460  bp  BamHI/EcoRI  fragment 
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(Figure  3-20).   The  first  deletion,  pF0002Dl,  was  made  by  digestion  of 
pF0002  with  PstI  and  then  a  partial  digestion  with  Dral .   Tlie  Dral/PstI  . 
(2.16  kb)  fragment  was  isolated  and  cloned  into  the  Smal  site  in  pUC19. 
Before  ligation  the  DNA  was  treated  with  Klenow  fragment  to  blunt  the 
ends  of  the  insert  molecule.   This  construct  effectively  deletes  145  bp 
of the 5' sequence from -1065 to -920 bp. The second deletion,
pF0002E9, was prepared in a similar manner except that the initial
digestion was with XbaI. After the blunt end reaction with Klenow, the
partial digestion was with EcoO109. The 1630 bp EcoO109/PstI fragment
was purified and ligated to pUC19 digested with SmaI under
blunt end conditions as described in Materials and Methods. pF0002E9
(-730) deletes 335 bp from the 5' end of pF0002. Just prior to the
construction of pF0002D1 and E9 we made the construct pF0007 by an
EcoRI partial digestion of pF0003 linearized with PstI. The 1.84 kb
PstI/EcoRI fragment was cloned into pUC19 (Figure 3-1). This construct
was  made  to  assess  the  contribution  of  the  200  bp  between  -410  (pF0005) 
and  -610.   It  was  decided  that  instead  of  making  stable  cell  lines, 
which  had  been  a  confusing  endeavor  up  to  this  point,  we  would 
transiently  transfect  C127  cells  with  these  constructs  to  assess  any 
possible negative regulatory effects. The transfections of pF0002,
pF0002D1, pF0002E9, pF0007, pF0005, pF0108A, and pF0001 were done
according to the protocol described in Materials and Methods. Two of
the S1 nuclease assays performed on RNA isolated from the transfected
cells are presented in Figure 3-21. Both C127 cells and Ltk- cells
were utilized in this series of experiments. The data from the series
of transfections, 6 in total, are presented in Table 3-3. There was


Figure 3-21 S1 nuclease analysis of transiently transfected C127
and Ltk- cells: Determination of putative negative
regulatory element position in distal promoter sequence
of the F0108 H4 histone gene.

S1 nuclease analysis was performed on total cellular RNA of both C127
and Ltk- cells after transfection with 10 µg of each histone deletion
construct as described in Materials and Methods. 50 µg of total
cellular RNA was used for each hybridization reaction. The results of
two transfections are presented. The human (280 nt) and mouse (110 nt)
protected fragments are designated at the right. Markers (M), pBR322
digested with HpaII and labelled with α-32P-dCTP and Klenow fragment, and
pertinent sizes are shown. Densitometry of the human and mouse signals
from autoradiograms permitted the quantitation of expression from each
construct. The clones transfected into either C127 cells (left panel)
or Ltk- cells (right panel) are specified above each lane. These two
autoradiograms are representative of the 6 experiments that were
performed. An analysis of the data is presented in Table 3-3 and Figure
3-22.

Table 3-3  Summary of Transient Expression Data.

Experiment    1      2      3      4      5      6
Cell Type     C127   Ltk-   Ltk-   C127   Ltk-   C127
Construct                                               Avg ± SD

pF0002        56     57     66     54     26     54     52.1 ± 13.5
pF0002D1      12     25     19     53     34     48     31.8 ± 16.2
pF0002E9      100    100    100    100    100    100    100.0
pF0007        68     34     14     42     -      56     42.8 ± 20.7
pF0005        9      72     42     49     23     52     41.1 ± 22.3
pF0001        ND     0.5    1      0.5    ND     ND     0.66 ± 0.3
pF0108A       ND     2      4      2      ND     ND     2.6 ± 1.1

Each construct was transfected into either C127 or Ltk- cells and
analyzed by densitometry of the autoradiograms. The amount of
transcription is expressed as a percent of pF0002E9 expression since it
was consistently the most highly expressed construct. The individual values
for each experiment are listed and the average expression (Avg) for the
construct ± the standard deviation (SD) is listed at the right. Only 5
transfections of pF0007 were done; therefore there is no value for
experiment #5. ND, not determined. Densitometry was only done on three
of the experiments in which pF0001 and pF0108A were included.
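
The averages in Table 3-3 can be recomputed with a short Python sketch. The values are transcribed from the table; the names `TABLE_3_3` and `summarize` are illustrative only. The use of the sample (n - 1) standard deviation is an assumption on our part, chosen because it reproduces the tabulated values to within rounding.

```python
from statistics import mean, stdev

# Percent of pF0002E9 expression, experiments 1-6, as listed in Table 3-3.
# None marks a value that was not determined (ND) or a transfection not done.
TABLE_3_3 = {
    "pF0002": [56, 57, 66, 54, 26, 54],
    "pF0002D1": [12, 25, 19, 53, 34, 48],
    "pF0007": [68, 34, 14, 42, None, 56],
    "pF0005": [9, 72, 42, 49, 23, 52],
    "pF0001": [None, 0.5, 1, 0.5, None, None],
    "pF0108A": [None, 2, 4, 2, None, None],
}


def summarize(values):
    """Return (mean, sample SD) over the experiments that have a value."""
    kept = [v for v in values if v is not None]
    return mean(kept), stdev(kept)


if __name__ == "__main__":
    for construct, values in TABLE_3_3.items():
        avg, sd = summarize(values)
        print(f"{construct:10s} {avg:6.2f} +/- {sd:5.2f}")
```

Because every value is already a percentage of the pF0002E9 signal in the same experiment, pF0002E9 itself is fixed at 100 and has no spread, which is why it is left out of the dictionary.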
Figure 3-22 Compilation analysis of 6 transient assays with histone
H4 deletion constructs: analysis of putative negative
regulatory element.

The data from all six transient transfection experiments were averaged
for each construct and plotted with standard deviation bars. The data,
as in Table 3-3, are calculated as the percent of pF0002E9 expression.
pF0002, pF0002D1, pF0002E9, pF0007, and pF0005 are included.
variability from experiment to experiment and the average of 6
experiments is plotted graphically in Figure 3-22. Plasmid DNAs were
examined on agarose gels to determine the percent of Form I and to
ensure that the quantitation was accurate. In both Table 3-3 and
Figure 3-22 the values for pF0002, pF0002D1, pF0007 and pF0005 are
expressed as the percentage of pF0002E9, which was consistently highest
throughout the 6 experiments. The data for pF0001 and pF0108A were not
included in Figure 3-22 as they were considerably lower than any of the
other constructs; however, the data are presented in Table 3-3.

Our first observation was that the level of pF0005 expression was
very similar to that of pF0002, in apparent contrast to the data from
the stable cell lines (Table 3-2) where pF0005 (-417 bp) had a 3-fold
higher level of expression than pF0002 (-1065 bp). The most likely
explanation for this appears to relate to differences between
expression from stably integrated and episomal DNA molecules. This
difference in the state of the DNA may also have affected the level of
expression from pF0001 and pF0108A (discussed below). The original
hypothesis was that the consensus negative regulatory sequences
described earlier, Box 1 and Box 2 (Baniahmad et al., 1987), were
responsible for the decrease in expression of the pF0002 construct.
Our results demonstrated that both of the consensus negative regulatory
elements (Box 1, -710 bp and Box 2, -580 bp) were located in the
sequences included in the construct pF0002E9, yet it was the most highly
expressed construct of the group. This result disproves the idea that
the decrease in expression is due to the proposed negative regulatory
sequences. However, when additional sequences are added in pF0002D1
Figure 3-23 S1 nuclease analysis: Comparison of pF0007 and pF0005
monoclonal cell lines.

Analysis of total cellular RNA from each cell line was as described in
Materials and Methods. Lanes are labelled as in previous S1 analysis
figures: M, pBR322 HpaII marker; pF0007 and pF0005 monoclonal cell line
numbers are denoted above the lane; H, HeLa RNA; C, C127 RNA. Both
human and mouse S1 nuclease protected fragments are denoted to the
right.
and pF0002 the expression is lower. If there is a negative element,
and this evidence is again only suggestive for one, it probably lies in
the sequences between the DraI site (-920) and the EcoO109 site (-730).
Interestingly, both the topoisomerase II site (-890 bp) and the DNaseI
hypersensitive site (-720 to -820 bp) are included in this region of
the promoter.

In  addition,  we  can  state  that  the  sequences  included  in  the 
construct  pF0007  (-410  to  -610  bp)  do  not  contribute  to  the  level  of 
expression.   In  the  transient  assays  the  expression  from  pF0005  and 
pF0007  was  nearly  identical  (Figure  3-22  and  Table  3-3).   The  pF0007 
construct  was  also  transfected  into  C127  cells  and  monoclonal  cell 
lines  prepared.   We  compared  the  level  of  expression  of  the  two 
constructs and found no significant difference. The S1 nuclease
analysis  is  presented  in  Figure  3-23  and  the  expression  data  were 
calculated  as  before  and  displayed  in  Table  3-2. 

The construct pF0002E9 was consistently expressed at a higher level
than any of the other constructs. An examination of the additional
sequence included in the construct pF0002E9 revealed a putative CCAAT
box (-700 bp) which matches the consensus sequence exactly.
Perhaps this element is responsible for the 2-fold increase in the
level of expression. How the CCAAT box, normally a proximal promoter
element, might function in this particular position is unknown; however,
there is precedent for the action of distal regulatory elements
through bending of the DNA molecule (Ptashne, 1986). The results we
have presented here are preliminary, but similar results have been
obtained by Ken Wright of our laboratory with in vitro transcription of
the same deletion constructs from circular templates in nuclear
extracts (personal communication).

In  the  stable  cell  lines,  the  expression  of  a  particular  construct 
was  apparently  determined  by  the  histone  5'  sequences  and  the  number  of 
copies integrated. In the transient assays, there was a depression in
expression of pF0001 and pF0108A (Table 3-3) as compared to pF0005.
This was consistent with the results of the stable cell lines; however,
the effect was more dramatic in the episomal system. pF0001 expression
was low and usually imperceptible in stable cell lines and, consistent
with this result, was only at a low level in the transient
transfections. Even with the additional 5' sequence that pF0001
includes, the internal deletion of the EcoRI (-210)/EcoRI (-610)
fragment again suggested that there was a positive element in the 200
bp between -210 and -410 nt. An alternative explanation for these
results, however, is possible. pF0001 and pF0108A are different from the
other deletion constructs in that they are both pBR322 clones whereas
the others are all derivatives of pUC plasmids. Perhaps this
difference in the length and composition of the vector was more
dramatically accentuated in the transient assay system. In stable cell
lines, a comparison of the 3' deletions pF0108A and pF0108X revealed no
significant differences (p < 0.1) in the level of expression (Table
3-2). If anything, pF0108X was slightly lower in expression and is a
pUC19 clone. Perhaps the functionality of these upstream elements
relies  on  a  particular  chromatin  structure  of  the  region  which  is  only 
obtained  when  the  constructs  are  integrated. 


Figure  3-24     Schematic  diagram  for  the  production  of  unidirectional 
deletions  with  Exonuclease  III  and  Mung  Bean  Nuclease. 

The original construct pF0005BS was made by insertion of the pF0005
insert into the PstI and HindIII sites of Bluescript M13+. This
construct was digested with ApaI, which produces a 3' overhang, and
HindIII, which produces a 5' overhang, as shown. Next, 30 units of
Exonuclease III were added and aliquots removed every minute for 5
minutes. The DNA aliquots were diluted in Mung Bean nuclease buffer and
9 units of Mung Bean nuclease were added. The DNA was then ligated under
blunt end conditions and transfected into competent DH5 cells. The
complete protocol is detailed in Materials and Methods. The resulting
products of this reaction are unidirectional deletions, because
Exonuclease III is unable to digest a 3' single strand overhang (ApaI).
Restriction enzyme sites are designated as PstI, P; EcoRI, E; HindIII,
H; ApaI, A; and KpnI, K.
Distal-proximal positive element
Our results indicated that the sequences from -210 to -410 bp might
enhance the level of H4 gene expression. To address this question,
deletion mutants of pF0005 were prepared. The pF0005 insert was
subcloned into the Bluescript M13+ vector and deletions with
Exonuclease III were prepared as described in Materials and Methods.
The procedure is schematically displayed in Figure 3-24. Ideally, the
protocol should have produced unidirectional deletions from the HindIII
site at -410 bp toward the EcoRI site at -210 bp. This method relied
on the inability of Exonuclease III to digest a 3' overhang (ApaI).
However, the protocol worked very poorly, and only 2 deletions were
obtained in the region of interest (-210 to -410 nt). These were
denoted pF0005BSdel2-6 (-285 bp) and pF0005BSdel2-10 (-335 bp).
Monoclonal cell lines of pF0005BS, pF0005BSdel2-10 (2-10), and
pF0005BSdel2-6 (2-6) were prepared and assayed by S1 nuclease analysis.
Four of 12 monoclones were positive for pF0005BS. Five of 12 and 1 of
12 were positive for 2-10 and 2-6, respectively. The S1 analysis of
these cell lines (Figure 3-25) was repeated 2 times and the average
expression from both pF0005BS and 2-10 was identical (1.5 ± 1.4). This
value represents only the absolute amount of the human S1 protected
fragment, as measured by the densitometer, averaged for each
construct. Since only a single monoclone was positive for 2-6 it was
not included in the analysis. The mouse H4 S1 protected fragment
presented an unusual pattern even upon repetition and was not included
in the analysis. The control S1 nuclease analysis of C127 RNA worked
well (Figure 3-25) but the sample C127 protected bands were always
Figure 3-25 S1 analysis of pF0005BS and Exonuclease III deletions.

S1 nuclease assays were done as described in Materials and Methods. The
clone number of each cell line is denoted above the lane. M, pBR322
digested with HpaII and labelled with α-32P-dCTP and Klenow fragment.
H, HeLa total cellular RNA. C, C127 total cellular RNA. The human (280
nt) and mouse (110 nt) S1 nuclease protected fragments are denoted at
the left.
lower or not detectable even when the human H4 protected fragment was
easily  seen.   We  have  noticed  this  occasionally,  but  never  in  so  many 
samples  at  once.   Occasionally,  high  copy  number  cell  lines  or  the  HeLa 
total  cellular  RNA  control  have  exhibited  a  similar  pattern  of 
hybridization,  but  there  has  been  no  consistency. 

The results of transient assays of pF0005BS, 2-10, and 2-6 were
inconsistent due to quantitation problems with the plasmid DNAs that
were not discovered until after the completion of the analysis. Even
with these problems the results of the transient assays supported the
idea that pF0005 and pF0005BS were the same with respect to the level
of expression (data not shown). Even though more DNA was added in the
transfection than originally thought, we were able to conclude that
2-10 (-335 bp) was not significantly different from pF0005BS or pF0005.
Unfortunately, we were not able to draw any conclusions about the 2-6
construct from the transient assays. We can only say that in stable
cell lines it was expressed at a low level in the single monoclone of
12 that was positive. The results support the contention that removal
of 80 bp (2-10) from pF0005 has little effect on the level of
expression. The transcriptional analysis of these deletions has been
repeated by Ken Wright with HeLa nuclear extracts in vitro. He has
obtained results similar to those presented here (personal
communication).

Enhancer  Element 
Dr. Sherron Helms of our laboratory had previously identified the
distal EcoRI/EcoRI fragment (-6.0 to -7.5 kb), designated pF0116, of
the λ HHG41 clone as a possible enhancer element (Helms et al., 1987).
Figure 3-26 S1 nuclease analysis of pF0004 monoclonal cell lines.

S1 nuclease assays were performed as described in Materials and
Methods. The number of each clone is displayed above the lane and
below the construct designation line. M, pBR322 digested with HpaII
and labelled with α-32P-dCTP and Klenow fragment. H, HeLa total
cellular RNA. C, C127 total cellular RNA. Both human (280 nt) and mouse
(110 nt) H4 protected fragments are designated. Pertinent markers are
noted.
Figure 3-27 S1 nuclease analysis of pF0004R monoclonal cell lines.

S1 nuclease assays were performed as described in Materials and
Methods. M, pBR322 DNA digested with HpaII and labelled with α-32P-dCTP
and Klenow fragment. H, HeLa total cellular RNA. C, C127 total cellular
RNA. Clone numbers (1-10) are designated above each lane. Both human
and mouse H4 protected fragments are noted at the right.


Figure 3-28 Copy number analysis of pF0004 monoclonal cell lines.

Southern blot analysis was performed as described in Materials and
Methods. 10 µg of DNA from each cell line were analyzed with nick
translated EcoRI/XbaI fragment from pF0002. A. pF0004 cell line DNA
probed with H4 sequences. B. The histone probe was removed and the blot
was reprobed with the mouse 18S ribosomal fragment. Densitometry of the
1070 bp band specified by the arrow in A and the 18S ribosomal band in
B permitted quantitation of the copy number through normalization to
the amount of DNA actually loaded and transferred as described in the
Materials and Methods. The figure in A is a composite of several
exposures that reflects the actual copy number and accounts for
original quantitation errors. The plasmid controls for quantitation are
labelled 10, 50 and 100, designating the number of pg loaded. C, C127
cellular DNA. H, HeLa cellular DNA. M, λ DNA digested with EcoRI and
HindIII and labelled with α-32P-dCTP and Klenow fragment. Clones are
designated with their number above each lane. Nonessential lanes in B
have been omitted.
She demonstrated that linkage of this fragment to the 3' end of the
CAT gene in pSV1CAT (Gorman et al., 1982) increased the expression of
the CAT gene 4 to 5 times in transiently transfected HeLa cells.
Enhancers can be located at considerable distances from the gene that
they affect. The chicken lysozyme gene enhancer is located 7 kb
upstream of the gene (Theisen et al., 1986). To investigate this
result  further  a  series  of  constructs  were  made  and  assayed  in  stable 
cell lines. The constructs pF0004, pF0004R, and pF0006 are depicted
schematically in Figure 3-1 and were constructed by standard cloning
procedures. pF0004 fused the pF0116 fragment to the pF0108P construct
in the genomic orientation. The construct pF0004R was made to reverse
the orientation, and pF0006 linked the 500 bp EcoRI/XbaI fragment of
pF0116 to pF0108X.

The  constructs  were  transfected  into  mouse  C127  cells  and  selected 
for the growth of monoclonal cell lines. The S1 nuclease analysis of
pF0004,  pF0004R,  pF0006  and  the  control  cell  line  pF0108X  are  presented 
in  Figures  3-26,  3-27,  and  3-15  respectively.   The  expression  data  are 
presented  in  Table  3-2.   The  only  cell  lines  with  significantly  higher 
levels  of  expression  than  pF0108X  contained  the  pF0004  construct.   To 
determine  if  this  was  truly  the  result  of  an  enhancement  or  a 
phenomenon  of  copy  number,  the  latter  was  determined  by  Southern  blot 
analysis  (Figure  3-28)  as  described  previously  in  Materials  and 
Methods. When the pF0004 cell line copy numbers were included in the
expression/copy ratio, the level of expression dropped to control
(pF0108X) level. Since neither pF0004R nor pF0006 had a significant
difference in expression from pF0108X, the copy number for these cell
lines was not determined except for pF0006m3. This cell line
presented the unusual mouse S1 nuclease protected fragment seen with
the pF0005BS cell lines and therefore had high expression. The copy
number was determined to exclude the possibility that this was an
enhancer effect. pF0006m3 was included on the pF0108X copy number blot
(Figure 3-16), and from this blot it was determined that pF0006m3 has
approximately 30 integrated copies, which accounted for the higher
level of expression.
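
The reasoning behind the expression/copy ratio can be illustrated with a short Python sketch. It is not the calculation given in Materials and Methods; the function names, parameters and all numerical values below are hypothetical and serve only to show how a high level of expression can fall to the control level once copy number is taken into account.

```python
def estimate_copies(h4_signal, rrna_signal, h4_signal_per_copy, rrna_reference):
    """Estimate integrated copy number from Southern blot densitometry.

    h4_signal          densitometer reading of the H4 band in a lane
    rrna_signal        reading of the 18S ribosomal band in the same lane
    h4_signal_per_copy H4 signal expected for one copy at reference loading
                       (hypothetically derived from the plasmid standards)
    rrna_reference     18S reading that corresponds to reference loading
    """
    # Correct for the amount of DNA actually loaded and transferred.
    loading = rrna_signal / rrna_reference
    return h4_signal / (loading * h4_signal_per_copy)


def expression_per_copy(expression, copies):
    """Divide the measured expression by the estimated copy number."""
    return expression / copies


if __name__ == "__main__":
    # Hypothetical readings for a high copy line and a low copy control.
    high = estimate_copies(300.0, 2.0, 5.0, 1.0)
    control = estimate_copies(10.0, 1.0, 5.0, 1.0)
    print(expression_per_copy(60.0, high), expression_per_copy(4.0, control))
```

With these made-up numbers the high copy line expresses fifteen times more RNA than the control, but the ratio per copy is the same, which is the pattern that would argue for a copy number effect over an enhancer effect.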

The lack of an enhancer effect by pF0116 was surprising in light of
the previous demonstration of the effects on CAT gene expression. The
sequence for the entire pF0116 fragment was determined by Ken Wright
and Urs Pauli of our laboratory, and the proximal EcoRI/XbaI fragment
contains three sequences with similarity to the consensus core enhancer
element (5'-TGTGGAAA-3') as described for the Ig heavy chain and SV40
enhancers (Wasylyk and Wasylyk, 1987; Khoury and Gruss, 1983). The
presence of this sequence has been shown to be neither sufficient nor
necessary for enhancer activity in the IgH enhancer (Wasylyk and
Wasylyk, 1987; Kadesh et al., 1986). The reasons for a lack of
activity in mouse C127 cells, and activity in HeLa cells, are purely
speculative. Certainly differences in the proteins that interact with
enhancers in different tissue types have been documented (Maniatis et
al., 1987; Davidson et al., 1986). The evidence from the stable cell
lines  we  have  prepared  does  not  support  the  idea  that  the  pF0116 
fragment  enhances  or  augments  the  expression  of  the  F0108  H4  histone 
gene  when  stably  integrated  in  a  mouse  cell.   The  fact  that  the  pF0004 
monoclonal  cell  lines  had  such  a  high  average  copy  number  has  been 
investigated further in Chapter 4 with respect to specificity and mode
of integration. Ken Wright of our laboratory has also demonstrated
that the pF0116 fragment is unable to enhance the transcription in
vitro of the human H4 gene (personal communication).
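
A search for sequences with similarity to the core enhancer element (5'-TGTGGAAA-3') can be sketched as a mismatch-tolerant scan of both strands. This is a minimal Python illustration only; the function names, the one-mismatch threshold and the demonstration sequence are assumptions and do not reflect how the pF0116 sequence was actually examined.

```python
CORE = "TGTGGAAA"  # consensus core enhancer element quoted in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def near_matches(sequence, motif=CORE, max_mismatches=1):
    """List windows within max_mismatches of the motif on either strand.

    Each hit is (0-based offset, strand, window text, mismatch count).
    Minus-strand hits are found by comparing the top strand with the
    reverse complement of the motif.
    """
    sequence = sequence.upper()
    width = len(motif)
    hits = []
    for strand, target in (("+", motif), ("-", revcomp(motif))):
        for start in range(len(sequence) - width + 1):
            window = sequence[start:start + width]
            mismatches = sum(a != b for a, b in zip(window, target))
            if mismatches <= max_mismatches:
                hits.append((start, strand, window, mismatches))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical sequence made up for this example only.
    for hit in near_matches("CCTGTGGAAAGGTTTCCTCAGG"):
        print(hit)
```

As the text notes, the presence of such a sequence is neither sufficient nor necessary for enhancer activity, so a hit from a scan of this kind is at most a candidate for functional testing.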

The  contribution  of  promoter  sequences  to  the  expression  of  the 
F0108  H4  histone  gene  has  been  determined  in  stable  cell  lines. 
Initially  this  approach  appeared  to  be  the  most  accurate  way  to 
determine  functionality;  however,  in  retrospect  there  are  a  number  of 
variables  which  can  not  be  accounted  for.   A  brief  comparison  of  the 
results  from  the  transient  assays  done  to  assess  the  negative 
regulatory  element  hypothesis  indicates  the  heterogeneity  that  can 
occur  in  the  results  and  their  interpretation  as  a  result  of  the 
methodology  utilized  to  perform  the  experiment.   We  originally  thought 
that the mouse H4 S1 nuclease probe would be the ideal internal
control,  but  subsequently  we  have  realized  that  it  has  faults  for  which 
we  cannot  correct  in  our  interpretation  of  the  results.   The 
possibility  exists  that  limiting  transcription  factors  are  present  in 
only  sufficient  amounts  to  transcribe  the  mouse  H4  genes  present  in  the 
cell.   The  introduction  of  the  human  H4  genes  into  the  genome  of  the 
mouse  cell  likely  disturbs  this  equilibrium.   The  possibilities  for 
misinterpretation  are  considerable.   If,  at  low  copy  number,  the  human 
H4  genes  do  not  effectively  compete  for  mouse  transcription  factors 
then  we  have  probably  underestimated  their  relative  expression.   At 
high  copy  number  it  is  quite  apparent  that  the  expression/copy  ratio 
decreases.   Based  on  what  we  have  presented  here  and  later  in  Chapter  4 
we will formally assess the results of the transcription data in
Chapter 5 with respect to the low copy number cell lines only.
Although  this  limited  my  data  base  it  appeared  to  be  the  only 
reasonable  way  to  proceed  in  order  to  fairly  evaluate  the  data  we  have 
collected. 

Nuclear  run-on  analysis  of  H4  transcription 
This  section  of  the  results  is  added  purely  as  a  note  to  those  who 
might  try  similar  experiments  as  described  below.   None  of  the 
experiments  we  have  described  above  directly  assess  the  level  of 
transcription.   Differences  in  the  5'  region  of  the  promoter  were 
assayed  in  log  phase  cells  under  the  assumption  that  the  mRNA  was  the 
same  for  all  constructs,  therefore  any  differences  in  the  level  of  mRNA 
were  a  reflection  of  transcription.   This  interpretation  is  fine  and 
holds  up  reasonably  well  when  deletion  constructs  are  compared  to  one 
another.   However,  to  examine  transcription  directly  it  is  necessary  to 
eliminate  the  mRNA  stability  variable  in  histone  gene  metabolism.   Our 
laboratory  has  utilized  nuclear  run-on  transcription  to  identify  the 
time  and  extent  of  human  histone  gene  transcription  during  the  cell 
cycle (Baumbach et al., 1987). We felt that our monoclonal cell lines
would be ideal candidates for such an analysis and that we could
determine the region of the promoter responsible for the 3-5 fold
increase in the level of transcription during S-phase of the cell
cycle. Briefly, nuclei were isolated from the cells at 4°C and
transcription allowed to continue in the presence of α-32P-UTP for 30
minutes.   The  labelled  RNA  was  purified  and  used  to  probe  blots  that 
had  plasmid  DNAs  immobilized  (excess  DNA  hybridization).  In  short, 
regardless  of  the  temperature,  salt  concentration,  or  aqueous  state  of 
the hybridization reaction we were able to observe only mouse H4 mRNA
cross hybridization to the human H4 plasmid DNA (data not shown). An
alternative approach to the detection problem was tried, which we
called a "reverse S1 analysis". The labelled nuclear run-on
transcripts were incubated with cold probe RNA made from the T3
promoter of a Bluescript clone of pF0002. This was then digested with
S1 nuclease, electrophoresed, and visualized as usual. As a control,
the probe pF0002 RNA was labelled with α-32P-UTP and hybridized to HeLa
total cellular RNA. The control worked well, but the test reaction was
only a smear (data not shown). This result had previously been
predicted by my outside examiner Dr. Barbara Sollner-Webb. She felt that
the  technique  would  not  work  because  of  stable  double  stranded 
ribosomal  RNA  that  would  be  labelled  and  obscure  the  histone  signal. 
We  decided  that  unfortunately  this  approach  was  not  possible  in  our 
system. 


CHAPTER  4 
PLASMID  INTEGRATION  SITES,  INTEGRITY  AND  PROTEIN/DNA  INTERACTIONS 

One  goal  of  modern  molecular  biology  is  to  understand  the  molecular 
events  that  occur  during  the  integration  of  exogenous  DNA  into  the 
chromosome  of  a  cell.   These  processes  have  been  examined  in  detail  by 
several  investigators  and  are  important  for  the  study  of  biological 
problems in eukaryotic cells (Loyter et al., 1982; Perucho et al.,
1980; Folger et al., 1982; Lin and Sternberg, 1984). The problem of
what  happens  to  the  DNA  molecules  once  they  enter  the  cell  is 
intriguing  as  it  gives  a  glimpse  of  the  complicated  recombinational 
processes that occur inside the cell. Loyter et al. (1982)
demonstrated that there was a limit to the amount of DNA a plate of
cells could take up, and that only a small percentage of the DNA that
entered the cytoplasm subsequently entered the nucleus. Previous work
by Perucho et al. (1980, 1981) demonstrated that as foreign DNA (e.g.
plasmid  DNA  with  a  gene  of  interest)  entered  the  nucleus  of  a  cell  it 
became  recombinationally  active.   Because  there  is  usually  little  or  no 
homology  between  the  foreign  DNA  and  the  cellular  DNA  the  first 
recombination  events  that  occur  are  between  the  plasmid  DNA  molecules 
and  carrier  DNA.   Cointegrates  form  in  the  nucleus  shortly  after  the 
introduction  of  the  DNA  into  the  cells.   These  very  large  circular 
molecules contain many plasmid molecules arranged in a head-to-tail
manner. When integration into the host chromosome does occur, a large
number of plasmid molecules are likely to integrate stably in a
head-to-tail fashion at a single location (Perucho et al., 1980). For my
purposes it was necessary to assess the integrity of the integrated
histone deletion constructs, the structural relationship to the
cotransfected plasmid pSV2neo, and the mode of integration.

Integrity of Flanking Sequences
To assess the intactness of the proximal flanking sequences, we
examined the copy number blots of the constructs with 210 bp or less of
5' flanking sequence. In the constructs J67 (-47 bp), J56 (-73 bp),
J50 (-100 bp), K8 (-155 bp), L14 (-185 bp) and pF0108A (-215 bp)
(Figures 3-4, 3-6, 3-10, 3-14) the EcoRI/XbaI fragment represents the
entire coding and 5' flanking sequences. The EcoRI/XbaI digest was
originally  chosen  because  it  is  a  fragment  common  to  all  the 
constructs  used  in  the  study.   It  was  easily  determined  by  inspection 
of  these  Southern  blots  that  all  or  nearly  all  of  the  H4  constructs 
were  integrated  in  a  manner  that  permitted  detection  of  the  human  H4 
insert  sequences  of  the  original  plasmid.   The  integrity  of  longer 
constructs  such  as  pF0003  and  pF0004  was  not  measurable  this  way.   To 
assess the integrity of the pF0003 flanking sequences the genomic DNA
from a polyclonal cell line pF0003p3 was digested with XbaI. This
digestion defines the entire 7.5 kb insert. The restricted DNA was
electrophoresed on a 1% agarose gel, blotted, and probed with the
EcoRI/XbaI fragment from pF0002. The results, presented in Figure 4-1a,
lane 4, demonstrate the predominance of a 7.5 kb band that
corresponds  to  the  presence  of  the  entire  pF0003  insert.   Genomic  DNA 

Figure 4-1 Southern blot analysis of genomic DNA from polyclonal
cell lines: assessment of flanking and coding sequence
integrity in pF0003 and pF0004.

Ten micrograms of genomic DNA from polyclonal cell lines pF0003p3 and
pF0004p2 were digested with either XhoI or XbaI, electrophoresed,
blotted, and probed as described in Materials and Methods. A. Lane 1, λ
DNA digested with HindIII and labelled with α-32P-dCTP and Klenow
fragment; Lane 2, XhoI digested pF0003p3 DNA; Lane 3, XhoI digested
pF0004p2 DNA; Lane 4, XbaI digested pF0003p3 DNA; Lane 5, XbaI digested
pF0004p2 DNA. The blot was probed with the histone H4 EcoRI/XbaI
fragment purified from pF0002 and nick translated as described in
Materials and Methods. In lanes 4 and 5 the expected size fragments are
noted with arrows at the right. B. Lanes 1-4 are the same as 2-5 in A.
Lane 5, λ DNA digested with HindIII and labelled with α-32P-dCTP and
Klenow fragment. The blot in B was probed with nick translated pUC8
DNA. The 2.7 kb band in lane 3 (pF0003p3) is linear pUC8.

from a polyclonal cell line of pF0004 was digested with XbaI, and
a large percentage of the 1.6 kb XbaI/XbaI fragment
that includes the coding region and much of the flanking sequences was
detectable as an intact fragment (Figure 4-1a, lane 5). To confirm
that the DNA was indeed integrated we digested both pF0003p3 and
pF0004p2 with XhoI, an enzyme that has no sites within either plasmid.
Figure 4-1, lanes 2 and 3, demonstrates that when the DNA is digested
with XhoI almost all of the hybridization to the human histone probe is
in the region of the blot that corresponds to very high molecular
weight DNA. Evidence for tandem integration was found when the blot in
Figure 4-1a was reprobed with pUC8 DNA. In Figure 4-1b, lanes 1 and 2
still show high molecular weight DNA as expected. Lane 3 has a
predominant 2.7 kb band that is probably pUC13. The fact that both the
pF0003 insert (7.5 kb) and vector (2.7 kb) bands were so readily
detectable was indicative of tandem integration. Unexpectedly the
pF0004p2 DNA did not have a similar 2.7 kb band. Instead there was a
heterogeneous pattern of hybridization to the pUC8 DNA, observed in
Figure 4-1b, lane 5.

These experiments were pursued further to establish the mode of
integration that had occurred in the monoclonal cell lines. This
information might allow one to understand how, or if, the
arrangement of histone insert sequences with respect to each other
affects expression. Our concern has been how to interpret the copy
number data with respect to expression. It is quite possible
that tandem integration, for example, might "protect"
internal integrates from chromosomal effects in cis. Is it possible to
assume that all the genes integrated in a cluster are going to
function equally well? When polyclonal cell line DNA from 108Ap2 was
digested with PstI and analyzed by Southern blotting, it was evident
that a single band of 6.2 kb was present and corresponded to full
length plasmid DNA (data not shown). Therefore integration appeared to
have occurred in a tandem fashion. The detection of such a high
percentage of the EcoRI/XbaI fragment in the copy number blots
mentioned above also indicated that tandem integration was probably
the pathway utilized.
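
The reasoning behind this test can be illustrated with a minimal Python sketch. It assumes an idealized, uninterrupted head-to-tail array and lists the fragments produced by an enzyme that cuts once per plasmid unit. The function name, the cut position within the unit, and the distances to the nearest host sites are hypothetical; only the 6.2 kb unit length is taken from the text.

```python
def tandem_digest(unit_len, cut_offset, copies, left_flank, right_flank):
    """Fragment lengths (bp) after cutting a head-to-tail array once per unit.

    left_flank / right_flank are the (hypothetical) distances from each end
    of the array to the nearest site for the same enzyme in host DNA.
    """
    # Positions of the cuts inside the array, measured from the left host site.
    cuts = [left_flank + i * unit_len + cut_offset for i in range(copies)]
    total = left_flank + copies * unit_len + right_flank
    bounds = [0] + cuts + [total]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Four copies of a 6.2 kb plasmid; cut position and flanks are made up.
fragments = tandem_digest(6200, 1000, 4, 3000, 5000)
unit_length_bands = fragments.count(6200)       # internal, unit-length pieces
end_fragments = len(fragments) - unit_length_bands
```

Under these assumptions an array of n copies yields n - 1 unit-length fragments and two end fragments of arbitrary size, so a dominant band at the full plasmid length is the pattern expected from tandem integration.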

We analyzed several other monoclonal cell lines with different
restriction enzymes and Southern blot analysis to establish that tandem
integration was a general phenomenon. In Figure 4-2, lanes 3 and 4
demonstrate that when genomic DNA from the monoclonal cell line
pF0003m1 was digested with PstI a predominant 10.2 kb band was detected
following hybridization with an oligo-labelled 3' noncoding XbaI/HincII
fragment of pF0002. Lane 3 is a lighter exposure of lane 4. We
concluded that tandem integration was apparently the mechanism used by
most constructs. This human H4 histone gene 3' probe permitted
detection of only human histone sequences since it contained no coding
region. In Figure 4-2, lane 9, pF0005m5 genomic DNA was digested with
BamHI, again an enzyme that linearizes the construct. Two bands were
detected with the histone 3' probe, linear pF0005 (4.3 kb) and a
slightly higher band that was not identified. This pointed toward
tandem integration and limited heterogeneity of integration sites.
Digestion of pF0108Am10 genomic DNA with BamHI (Figure 4-2, lane 13)
demonstrated more heterogeneity although the correct size


Figure 4-2 Southern blot analysis of monoclonal cell line
integration pattern and location of pSV2neo sequences.

This figure is a composite of the same blot that has been probed with
two different DNA fragments. The blot was first probed with a 3'
fragment from the F0108 H4 histone gene as described in the text and
Materials and Methods. This probe was removed and the blot was probed
for a second time with a fragment that contains the SV40 enhancer. The
complementary lanes from each analysis have been placed next to each
other to facilitate comparison of the data. Lanes 2-4, 6, 7, 9, 11, 13,
and 15 were all probed with the 3' histone H4 fragment. Lanes 5, 8, 10,
12, 14 and 16 were probed with the SV40 fragment. Lanes: 1, λ DNA
digested with HindIII/EcoRI and Klenow labelled; 2, HeLa DNA digested
with XbaI; a 7.5 kb band is detected. Lanes 3 and 4, pF0003m1 DNA
digested with PstI. The 10.2 kb linear pF0003 molecule is denoted at
the left (lane 3 is a shorter exposure of lane 4). Lane 5, pF0003m1 DNA
digested with PstI; the 2.3 kb PstI pSV2neo fragment is indicated at
the right by an arrow. Lanes 6 and 7, pF0004m11 DNA digested with
PstI. A 5.7 kb linear band is detected and indicated (lane 6 is a shorter
exposure of lane 7). Lane 8, pF0004m11 DNA digested with PstI. The 2.3
kb PstI fragment of pSV2neo is denoted. Lanes 9 and 10, pF0005m5 DNA
digested with BamHI. In lane 9 a 4.3 kb linear pF0005 band is denoted.
In lane 10, the linear 5.5 kb pSV2neo band is indicated. Lanes 11 and
12, pF0005m5 DNA digested with PstI. In lane 11 a 1.7 kb band
corresponding to the entire pF0005 insert is detected. In lane 12, the
2.3 kb PstI fragment of pSV2neo is noted. Lanes 13 and 14, pF0108Am10
DNA digested with BamHI. In lane 13 the 6.2 kb band of linear pF0108A
is noted. In lane 14 the 5.5 kb pSV2neo band is noted. Lanes 15 and 16,
pF0108Am10 DNA digested with PstI. In lane 15 the 2.2 kb band corresponding
to most of the pF0108A insert is detected. In lane 16 a 2.3 kb pSV2neo
band is detected as expected.

linear fragment was detectable (6.2 kb). When pF0005m11 and
pF0108Am10 DNA were digested with PstI, a fragment of the expected size
was readily detectable (Figure 4-2, lanes 11 and 15, respectively).
These results were consistent with tandem integration, perhaps in
several locations.

We noticed early in these studies that cell lines that contained
the construct pF0004 had a heterogeneous pattern of integration. When
pF0004m11 genomic DNA was digested with PstI, an enzyme that linearizes
the construct, there were fewer linear molecules (5.7 kb) detectable
(Figure 4-2, lanes 6 and 7; lane 6 is a lighter exposure of lane 7).
This increase in the heterogeneity of integration was associated with
the presence of an Alu repeat sequence in the 5' flanking region of
this construct. We examined the copy number data and calculated the
average copy number of each type of cell line (Table 3-2) and were able
to correlate the presence of repeated sequences with increased average
copy number. Previous work on repeated sequences associated with
histone gene clusters (Collart et al., 1985) had demonstrated the
presence of a strong Alu repeat in the most distal EcoRI/XbaI fragment
(-5.5 to -5.5 kb) of the putative H4 promoter sequences, and, to a
lesser extent, minor repeated sequences located between the BamHI site
(-1.65 kb) and the EcoRI site at -5.5 kb. The fact that the pF0003m1
cell line had a significant proportion of its DNA tandemly integrated,
as shown in Figure 4-2, lane 3, suggested that while the minor repeats
located in its flanking sequence have contributed to increased copy
number, they have not caused as much heterogeneity in the integration
sites as the Alu repeat in pF0004.

Figure 4-3 Specificity of pF0004 integration.

Southern blot analysis was done according to the procedures described
in Materials and Methods. A. Six pF0004 cell lines: pF0004p1, pF0004m1,
pF0004m8, pF0004m10, pF0004m11, and pF0004m19 were digested with either
XbaI (lanes 1-6) or PstI (lanes 7-12). The blot was probed with the
oligo-labelled XbaI/HincII fragment of pF0002 as described in Materials and
Methods. The 1.6 kb insert band of pF0004 is designated to the left of
the figure. B. The histone 3' probe was removed and the blot was
reprobed with the oligo-labelled 264 bp SV40 enhancer fragment as described
earlier. The lanes in B are identical to those in A. The position of
marker fragments is designated between A and B in kilobases. The
restriction enzyme digest is indicated below each lane.

The previous demonstration with the pF0004p1 cell line that a
significant portion of the insert was detectable and intact (Figure 4-1,
lane 5) suggested that these contrasting results might be the
consequence of specific integration. To determine whether the ability
to detect the pF0004 insert fragment in genomic DNA was limited to the
single polyclonal cell line we had examined, we repeated the experiment
with the pF0004p1 cell line and 5 monoclonal cell lines. Cell lines
with reasonably high copy number were utilized to aid detection and
assess the effect of integration on intactness of the flanking
sequences. The results, presented in Figure 4-3a, lanes 1-5,
demonstrate that in every cell line the 1.6 kb XbaI/XbaI fragment was
detectable and constituted a considerable portion of the signal present
in each lane. If the same pF0004 monoclonal cell line DNAs were
digested with PstI and probed for the presence of linear pF0004
molecules (Figure 4-3a, lanes 7-12) there was heterogeneity in
integration and very little linear (5.7 kb) pF0004 was detectable. If
so much of the insert XbaI/XbaI fragment was detectable, and so little
linear plasmid, there must have been an unusual integration event that
gave both results. It appeared a strong possibility that the Alu
sequence in the 5' flanking region was a site where specific
integration might occur. To test this hypothesis we digested three of
the pF0004 monoclonal cell line DNAs with NcoI. This enzyme has two
sites of digestion, one at +280 bp and one in the very distal 100 bp
of the 5' flanking sequence (-1.6 kb, Figure 4-4b). This digest
produces two DNA fragments of 1.9 kb and 3.8 kb. The Alu sequence is
located in the 1.9 kb NcoI fragment. It is important to recall that

Figure 4-4 Southern blot analysis of pF0004 integration: NcoI
digestion of genomic DNA.

The analysis was done as described in Materials and Methods. A. 10 μg
of DNA from monoclonal lines pF0004m8, m11, and m19 were digested to
completion with NcoI. The blot was probed with the EcoRI/XbaI fragment
from pF0002. This probe detects both the 1.9 kb and 3.8 kb bands. B.
Synopsis of the hybridization to the EcoRI/XbaI probe. This figure
presumes that two pF0004 molecules have integrated tandemly head to
tail through one of the Alu repeats. pF0004, when digested with XbaI,
produces a homogeneous 1.6 kb band and 2 copies are detectable.
Digestion with PstI produces a single detectable linear molecule of 5.7
kb and in this case one end fragment, designated by the dotted line and
the arrow. The data from part A of this figure support the prediction
that two 3.8 kb fragments and one 1.9 kb fragment would be detectable
from this double integrate.

when the pF0004 monoclonal DNAs were digested with XbaI, a
1.6 kb fragment was uniformly detectable, indicating that the flanking
sequences up to -715 bp were intact and integration had not occurred
between the two XbaI sites. The only other regions available for
integration were the XbaI/EcoRI fragment (-750 to -1750 bp) and the
pUC13 vector sequences. Because all constructs share similar vector
sequences it was unlikely to be this region that differentiated the
pF0004 construct from others in integrative mode. The pF0004 NcoI
digestion experiment was probed with the EcoRI/XbaI fragment of pF0002
which detects both the 1.9 and 3.8 kb NcoI fragments. The results of
the NcoI digestion Southern blot are presented in Figure 4-4a. Several
exposures were scanned densitometrically to determine the ratio of 3.8
kb to 1.9 kb fragment. In the three cell lines the ratio of the 3.8
kb and 1.9 kb NcoI fragments was approximately 2:1.

To explain this and previous results our current hypothesis is that
the pF0004 plasmid DNA integrated through the Alu sequence in no more
than two or three copies per integration site. As diagrammed in
Figure 4-4b, this hypothesis explains why, when two copies of pF0004
are integrated through the Alu sequence: 1) the XbaI/XbaI fragments
(there are 2) are both detectable, 2) only one of the two integrated
constructs is detectable when the PstI digestion is done, and 3) the
NcoI digestion produces two 3.8 kb fragments and one 1.9 kb fragment, as
seen experimentally (Figure 4-4a). Although not conclusive, it suggests some
preferential integration via the Alu sequence. This specificity of
integration through the Alu repeat accounts for the heterogeneity in
integration sites observed. Previously it was thought that integration
of plasmid DNA, in a single eukaryotic cell, occurred first at the
cointegrate stage, then at a single site in the chromosome (Perucho et
al., 1980). This is plausible because usually there is relatively
little homology between the transfected DNA molecules and the cellular
DNA. However, it is apparent from our results that repeated sequences,
such as the Alu repeat, which are well conserved from species to
species, may mediate specific and higher levels of integration than
normally possible.
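
The fragment counts predicted by this hypothesis can be checked with a small Python sketch. It models the integrated array as plasmid units joined head to tail, with both ends of the array lying at a break point inside the Alu-containing 1.9 kb NcoI fragment. The map coordinates, the position of the break, and the function name are hypothetical choices made only so that the fragment sizes match those given in the text (5.7 kb plasmid; 1.9 kb and 3.8 kb NcoI fragments; a single PstI site).

```python
from collections import Counter


def internal_fragments(unit_len, sites, break_pos, copies):
    """Sizes of restriction fragments lying wholly inside a head-to-tail array.

    The array is modelled as `copies` plasmid units joined end to end, with
    both ends of the array at `break_pos` (here, a point inside the Alu
    repeat). `sites` are cut positions on the circular plasmid map.
    """
    cuts = sorted(
        (site - break_pos) % unit_len + i * unit_len
        for i in range(copies)
        for site in sites
    )
    return Counter(b - a for a, b in zip(cuts, cuts[1:]))


UNIT = 5700          # pF0004 length in bp (5.7 kb, from the text)
NCOI = [0, 1900]     # hypothetical map giving 1.9 kb and 3.8 kb pieces
PSTI = [3000]        # hypothetical position of the single PstI site
ALU_BREAK = 1000     # hypothetical point inside the 1.9 kb fragment

ncoi_two_copies = internal_fragments(UNIT, NCOI, ALU_BREAK, 2)
psti_two_copies = internal_fragments(UNIT, PSTI, ALU_BREAK, 2)
```

With two copies this model gives two 3.8 kb fragments and one 1.9 kb fragment for NcoI, and a single unit-length molecule for PstI. With three copies the same model gives three 3.8 kb and two 1.9 kb fragments, so under these assumptions the NcoI ratio shifts from 2:1 toward 3:2 as copies are added.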

Location of pSV2neo Plasmid Sequences
The  second  point  to  be  addressed  in  these  experiments  was  whether 
the  pSV2neo  plasmid  was  located  in  the  proximity  of  the  human  histone 
H4  gene  constructs.   In  order  to  create  the  cell  lines  we  have  used  in 
this  study,  it  was  necessary  to  cotransfect  with  the  plasmid  pSV2neo. 
Our  primary  concern  was  to  establish  to  what  extent  the  SV40  enhancer 
might  associate  with  the  histone  promoter  deletion  constructs  and 
affect  the  expression  of  the  H4  constructs  in  a  cis  manner.   Since 
there  is  similarity  between  the  pBR322  portions  of  these  various 
plasmids  we  investigated  the  possibility  that  pSV2neo  and  histone 
deletion  plasmids  were  located  adjacent  to  each  other. 

An  early  observation  with  regard  to  this  problem  was  that 
constructs  such  as  J67  and  other  short  deletion  constructs  of  the  H4 
promoter  demonstrated  little  or  no  transcription  when  integrated 
stably.   This  result,  probably  more  than  any  other,  demonstrated  that 
the  pSV2neo  plasmid  had  little  or  no  influence  on  expression  of  the 
cotransfected  histone  plasmids.   It  was  reasonable  to  suppose  that  the 
integration  of  the  human  H4  histone  genes  occurred  at  a  sufficient 
distance from the influence of any endogenous strong promoter effects.
It was also unlikely that the pSV2neo cotransfected molecules had any
substantial effect on the expression of the transfected H4 histone
genes.

To initially address the location of the pSV2neo plasmid with respect
to human histone H4 sequences in the monoclonal cell lines, we
reprobed the monoclonal cell line Southern blots shown in Figure 4-2 with
an oligo-labelled EcoRI/EcoRI fragment that contained the entire SV40
enhancer sequence. A pUC8 clone of the 264 bp EcoRI/EcoRI fragment was
kindly provided to me by Gerard Zambetti of our laboratory and contains
both 72 bp repeats (originally derived from pDG014, a gift of Dr.
Sherman Weissman). Figure 4-2 is a composite of identical lanes
probed with either the histone 3' probe as detailed earlier or the SV40
enhancer fragment. We felt comparison would be easier if the
lanes were placed adjacent to each other instead of on separate
figures. In Figure 4-2 lanes 5 (pF0003m1), 8 (pF0004m11), 12
(pF0005m5), and 16 (pF0108A) are identical to the adjacent lanes 4, 7,
11, and 15. A PstI digest of pSV2neo produces three fragments and the
SV40 probe detects the 2.3 kb fragment that contains part of the
neomycin resistance gene, the SV40 promoter/enhancer, and some pBR322
sequence. When genomic DNA is digested with BamHI the pSV2neo DNA is
linearized, and if tandemly integrated, a 5.5 kb band should be
detectable. In Figure 4-2, lane 5, the pF0003m1 DNA cut with PstI
demonstrated a prominent 2.3 kb band as expected, but also has bands in
the region of the histone signal detected previously (10 kb) in lanes 3
and 4. The ability to detect a substantial amount of both the 10.2 kb
pF0003 DNA and the 2.3 kb pSV2neo fragment suggests that there is not
a substantial mixing of the two molecules in the pF0003 integration
site. The smaller fragments detected in lane 4 probably represent
"end" fragments of each integration event. An end fragment is detected
because it is the most distal plasmid sequence on either side of the
integration event. In the case of pF0003m1, the DNA is cut with PstI.
The pF0003 DNA molecules integrated at each end of the tandem array
will be cut once internally with PstI, and again at some
unknown distance into the cellular DNA at the next available PstI site.
Since this next PstI site is at an undetermined location on both ends
of the integration event, for every integration site there will usually
be two end fragments of unknown length detectable. The number of end
fragments can indicate the number of different sites into which the
construct has integrated. pF0003m1 has 10 or more fragments in
addition to the main band at 10 kb. This could be interpreted as
reflecting 5 integration sites in this monoclonal cell line, or perhaps
the inclusion of pSV2neo between tandemly repeated pF0003 molecules
causes periodic interruptions.
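
The arithmetic behind this estimate is simple enough to state as a short Python sketch. It assumes, as the paragraph above does, that each integration site usually contributes two detectable end fragments; the function name is hypothetical, and the rounding rule for odd counts is an added assumption rather than something stated in the text.

```python
def estimate_integration_sites(end_fragment_count):
    """Rough site count, assuming two end fragments per integration site."""
    if end_fragment_count < 0:
        raise ValueError("fragment count cannot be negative")
    # An odd count means at least one site shows only one resolvable end,
    # so round up rather than down.
    return (end_fragment_count + 1) // 2


sites = estimate_integration_sites(10)
```

The estimate is only as good as the band count: comigrating end fragments lower it, and interruptions of the array by pSV2neo, the alternative raised above, would inflate it.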

The digestion of pF0004m11 with PstI (Figure 4-2, lane 8) also
demonstrates that the 2.3 kb pSV2neo fragment is detectable and
constitutes a considerable portion of the signal in lane 8. A
comparison of this lane hybridized to histone 3' sequences (lane 7) and
hybridized to the SV40 enhancer fragment demonstrates that very few of
the bands detectable with the histone sequence probe are also detected
with the SV40 enhancer probe. This lack of congruity pointed to some
separation of the pSV2neo and pFO series plasmids upon integration.
The construct pF0005, when digested with BamHI in Figure 4-2, lane 9,
yields a 4.3 kb linear fragment when hybridized to histone 3' flanking
sequences. When hybridized to the SV40 fragment, a 5.5 kb band
corresponding to linear pSV2neo is detected (lane 10). The ability to
detect a majority of the pSV2neo DNA as a linear molecule supports the
idea that in many instances the pSV2neo plasmid has integrated
primarily in a site apart from the histone constructs. The band above
the 4.3 kb band (lane 9) and below the 5.5 kb band (lane 10) apparently
contains both histone and SV40 sequences. The construct pF0108Am10, when digested
with either BamHI (linearizes construct, 6.2 kb) (lane 13) or PstI (2.2
kb fragment) (lane 15), and probed with the histone 3' sequences,
resulted in the detection of many fragments. When hybridized with the
SV40 enhancer fragment, there is a 5.5 kb band in lane 14 and a 2.3 kb
band in lane 16, along with several additional bands, both larger and
smaller. In some instances there is identity between the fragments
detected by the two probes, so it is possible they are located in close
proximity or linked on restriction fragments.

Still, at this point it was difficult to determine the
relationship between the two transfected plasmids, histone and pSV2neo.
It was apparent that in some cases there was a reason to believe that
the two plasmids were not completely mixed during the integration
events. To determine the relationship in a different way, we reprobed
the blot in Figure 4-3a with the SV40 enhancer fragment after removal
of the histone probe. The idea in this experiment was that the pSV2neo
plasmid contains no XbaI restriction sites. Therefore, the digestion
with XbaI, which released greater than 90% of the pF0004 sequences as a
1.6 kb band, should determine whether there was any mixing between the
pSV2neo and pF0004 plasmid upon integration. The presence of a 5.5 kb
or larger band would be indicative of mixing and release upon XbaI
digestion. As can be seen in Figure 4-3b the XbaI digested pF0004
monoclonal cell lines (lanes 1-6) hybridized to the SV40 enhancer
fragment in a diffuse manner and primarily in the upper region of the
blot that was indicative of large DNA molecules. A few bands were
detectable in the pF0004m19 cell line and it should be noted that this
cell line has a very high copy number and a great deal of heterogeneous
integration. The same pF0004 monoclonal cell lines when digested with
PstI (Figure 4-3b, lanes 7-12) demonstrated that the pSV2neo sequences
are present and detectable as a 2.3 kb band. The pF0004 monoclonal
cell lines digested with XbaI and probed with the SV40 sequences
suggest that the pSV2neo plasmid DNA is not interspersed in the
integrated pF0004 plasmid DNAs. If the pSV2neo plasmid had been
released by XbaI digestion we would have expected a strong band(s) in
the high molecular weight region of the blot. The diffuse
hybridization throughout the lane is somewhat confusing and
unfortunately a C127 DNA control was not included on this gel. There
is the possibility that the DNA fragments that contained the pSV2neo
plasmid molecules were very large and did not transfer well from the
gel.

Given the facts presented and known about enhancers, particularly
the SV40 enhancer, it seems reasonable to conclude that this potent
enhancer has little or no effect on the human H4 histone sequence
integrated in these mouse cells. Because of the intensity of the
pSV2neo bands (2.3 kb) detectable in Figure 4-2 and Figure 4-3 it was
likely that the copy number of the pSV2neo plasmid in these cell lines
was very high. This was certainly the result of integration and
amplification under the selective pressure of G418. Because of the
selective pressure under which these cell lines were grown it was
impossible to determine the absolute pSV2neo copy number originally
present in the cell.

Compatibility  of  Mouse  and  Human  Regulatory  Proteins  and  Sequences 
Examination  of  the  copy  number  data  presented  in  Table  3-2  revealed 
that  as  the  copy  number  of  a  cell  line  increased  the  expression/copy 
decreased.   This  is  graphically  detailed  in  Figure  4-5  where  several 
cell  lines  have  been  compared  to  one  another  for  this  effect.   The 
obvious  trend  was  typified  by  pFOOOS.   When  cell  lines  with  fewer  than 
5  copies  are  plotted  the  expression/copy  was  high  (0.3),  but  when  copy 
number  rose  above  5  the  expression/copy  ratio  decreased  dramatically. 
Although  the  expression/copy  ratio  for  the  other  constructs  presented 
was  generally  lower  than  for  pFOOOS ,  the  decrease  with  increased  copy 
number was still apparent. This effect raised two questions: 1)
is it then appropriate to analyze only the low copy number cell lines
for differences from construct to construct? and 2) does this indicate
that the human and mouse H4 genes are in competition with each other
for necessary transcription factors? In chapter 3 we alluded to the
fact that there was a competition phenomenon. At that point we
interpreted the pF0108A and pF0005 data in the context of copy number.
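
The expression/copy quantity used in this comparison is a plain ratio, and the low versus high copy number split can be sketched in a few lines of Python. The function names and the example records below are hypothetical and are not data from Table 3-2; the threshold of 5 copies follows the text.

```python
def expression_per_copy(expression, copy_number):
    """Expression normalized to the number of integrated gene copies."""
    if copy_number <= 0:
        raise ValueError("copy number must be positive")
    return expression / copy_number


def split_by_copy_number(cell_lines, threshold=5):
    """Partition (name, expression, copies) records at a copy-number threshold."""
    low = [c for c in cell_lines if c[2] < threshold]
    high = [c for c in cell_lines if c[2] >= threshold]
    return low, high


# Made-up records for illustration only: (cell line, expression, copy number).
example = [("line_a", 0.6, 2), ("line_b", 1.0, 20)]
low, high = split_by_copy_number(example)
ratios = {name: expression_per_copy(e, n) for name, e, n in example}
```

Comparing constructs only within the low copy number group is one way to address the first question above, at the cost of discarding the high copy number cell lines.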

To  determine  whether  these  concerns  were  valid,  we  performed  an 
analysis  of  the  protein/DNA  interactions  in  the  5'  promoter  sequences 

Figure 4-5 Effect of cell line copy number on the expression of the
human H4 histone gene.

A plot of average expression/copy versus the cell line copy number.
Data from Table 3-2 were averaged for K8, pF0108A, pF0005, pF0002, and
pF0003. The average expression from all cell lines in a group with the
same copy number is presented as a single point. Most points are
representative of the value for a single monoclone and not averaged
with others. The legend in the figure designates each curve.

of the F0108 H4 histone gene. If promoter competition for
transcription factors occurred then we felt it might be possible to
detect the effect of high copy number on the binding of transcription
factors.

In  collaboration  with  Dr.  Urs  Pauli,  of  our  laboratory,  we 
characterized  the  protein/DNA  interactions  in  the  proximal  promoter 
region  of  three  monoclonal  cell  lines  containing  the  construct  pF0003. 
We  were  interested  to  know  whether  Site  I  and  Site  II  were  present  in 
the  proximal  promoter  of  the  human  H4  histone  gene  when  integrated  in  a 
mouse  cell  and  if  the  protein/DNA  contact  points  were  the  same.   The 
pF0003  cell  lines  were  chosen  for  several  reasons.   They  had  a  wide 
range  of  copy  number  available  and  we  felt  that  the  extensive  5' 
flanking  region  (-6.5  kb)  was  more  likely  to  assume  a  chromatin 
structure  like  that  found  in  a  human  cell.   Cell  lines  were  grown  until 
80-90%  confluent  and  treated  with  DMS  in  vivo  as  described  in 
Materials  and  Methods.   Genomic  DNA  from  each  cell  line  was  prepared, 
digested with HincII, electrophoresed, and blotted as described in
Materials and Methods. The filter with immobilized DNA was then
hybridized with the 5' HincII upper strand probe (Figure 4-6a). This
probe was used because the upper strand of the DNA contained 13 Gs
strongly protected from DMS treatment whereas the lower strand
contained only 3 minor protections (Pauli et al., 1987). All the G
residues that exhibit protection are noted on the side of Figure 4-6.
The boundaries of Site I and Site II are denoted to the right of Figure
4-6b. These were determined by Pauli et al. (1987) by DNaseI
protection. Therefore, we were able to easily detect any differences


Figure  4-6  Genomic  sequencing  analysis:  protein/DNA  interactions 
in  the  proximal  promoter  of  the  F0108  H4  his tone  gene 
stably  integrated  into  mouse  C127  cells. 

As  described  in  Materials  and  Methods  the  genomic  DNA  from  several 
different  monoclonal  cell  lines  of  pF0003  was  treated  with  DMS  in  vivo. 
The DNA was then purified, treated with piperidine, restricted with
HincII, electrophoresed, blotted and probed with the upstream 5' HincII
probe. A. Schematic diagram of the proximal region of the F0108 H4
histone gene. The single strand (HincII) probe that was utilized in
these experiments is designated with the large arrow. Restriction
enzyme sites are denoted as EcoRI, E; HincII, Hc; HindIII, H; NcoI, N.
The  large  box  is  the  H4  coding  and  leader  sequence.  Both  Site  I  and 
Site  II  are  designated  above  the  diagram.  B.  Genomic  sequencing 
analysis  of  protein  contact  points  in  Site  I  and  Site  II  of  the  human 
H4 proximal promoter region. Lanes: 1, control HeLa DNA, purified,
deproteinized, and then treated with DMS. Lane 2, HeLa DNA that was
treated in vivo to demonstrate the positions of Site I (-123 to -89 bp)
and Site II (-63 to -23 bp). At the left, the small arrows indicate the
protein/DNA interactions as detected by DMS methylation interference
(Pauli et al., 1987). Lane 3, pFO003m5 cell line DNA (copy number = 13)
treated in vivo with DMS. The three G residues at approximately -98 to
-100 bp are protected and denoted on the figure with an arrow. Lane 4,
control deproteinized pFO003m5 DNA. Lane 5, HeLa control DNA,
deproteinized and DMS treated. Lane 6, pFO003M6 (copy number = 20) DNA
treated in vivo with DMS. The protected G residues at -98 to -100 bp
are noted with an arrow on the figure. Lane 7, pF0003M1 (copy number =
140) DNA treated in vivo with DMS. The G residues are not protected
at -98 bp. At no time was there any detectable protein/DNA interaction
in pFO003m5, m6, or m1 at Site II or the distal part of Site I. Lane 8,
pF0003 plasmid DNA treated with DMS as a control for the G residue
sequencing pattern. The only detectable protein binding occurs in Site
I of pFO003m5 and m6 at the putative Sp1 site.

in the protein/DNA interactions in the heterologous mouse system. The
results presented in Figure 4-6b, lane 3, suggested that in a cell line
with low copy number, pFO003m5 (13 copies), a significant portion
(approximately 70%) of the genes had protein bound to the proximal side
of Site I (the putative Sp1 site; G residues -98 to -100 protected in
vivo), but there was apparently no protein bound to Site II. When the
copy number of the cell line increased to 20 (pFO003m6), there was still
protein/DNA interaction detectable at Site I, but not at Site II (lane
6). Finally, when 140 copies of the human histone gene were present
(pF0003m1), there was no detectable protein interaction at Site I or
Site II (lane 7). As a control for the presence and location of Site I
and Site II, synchronized HeLa cells, early in S phase, were treated with
DMS at the same time and subjected to the same protocol as the pF0003
cell lines (lane 2). pF0003 plasmid DNA and deproteinized HeLa DNA
were DMS treated as controls (lanes 8 and 1, respectively) for the
expected sequence pattern of the G residues.

The results substantiated the cell line expression data, which
indicated that a limiting factor (or factors) was necessary for the
transcription of histone genes. The results also support the contention
that the mouse and human transcriptional proteins are not necessarily
identical. Previous studies, including this work, have demonstrated
that the mouse cell is able to correctly express introduced human
histone genes (Green et al., 1986; Capasso and Heintz, 1985). In many
other respects the mouse cell is capable of regulating human histone
mRNA in a manner identical to that of the human cell. We have
demonstrated that the processed 3' ends of the human histone H4 mRNA
are identical in mouse and human cells and that the transcription
initiation sites are also identical (data not shown). However, our
failure to detect protein bound to the distal side of Site I and to all
of Site II, even at low copy number, indicated that there were
differences in the factors that bind there. The existence of these
factors and their binding in vitro have been demonstrated by van Wijnen
et al. (1988, and personal communication). Perhaps there are subtle
protein sequence variations that preclude detection with genomic
sequencing. Previously, van Wijnen et al. (1987) demonstrated that
there were factors that bound to the region of the H4 promoter from
-210 to -410 bp; however, these proteins were not detected in vivo by
genomic sequencing. Either these protein/DNA complexes were artifacts
of the in vitro assay system utilized or some protein/DNA interactions
are simply not detectable with genomic sequencing. We should note that
Dr. Pauli examined the region from -210 to -410 bp with both DNase I
and DMS protection. DNase I is likely to detect a majority of the
interactions, whereas DMS might not pick up every interaction (Dr.
Pauli, personal communication).

We  reanalyzed  the  copy  number  and  expression  data  in  light  of  the 
genomic  sequencing  results  and  found  that  in  most,  but  not  all  cases, 
the  human  and  mouse  absolute  densitometry  signals  inverted  as  the  copy 
number  increased.  This  is  graphically  presented  in  Figure  4-7  for 
several  cell  lines  including  pF0003.   We  calculated  the  percent  of  the 
total S1 nuclease protected fragment signal measured densitometrically
(mouse  H4  +  human  H4)  that  was  representative  of  the  mouse  H4  gene  and 
plotted  this  versus  the  copy  number  of  each  monoclonal  cell  line.   It 


Figure 4-7 Effect of human H4 gene copy number on mouse H4 gene
expression.

S1 nuclease protection data for several cell lines were analyzed to
determine the effect of the human H4 gene on the expression of the
mouse H4 gene. The human and mouse S1 nuclease assay densitometry
values were totaled and the percent of the total signal that was mouse
was plotted versus the copy number of the human H4 gene in each cell
line. Data from pF0003, pF0005, and pF0108A are shown to illustrate the
point that as human H4 copy number increases the expression of the
mouse gene decreases.

Figure 4-8 Reassessment of human H4 histone gene expression:
low copy number data.

The same cell lines that were depicted in Figure 3-17 are shown here.
Only data from low copy number cell lines has been included. This in
[illegible] copies/cell. Expression/copy number
is plotted with the standard deviation of the mean as a one way error
bar.

was obvious that as the human H4 copy number increased the percent of
total mouse S1 signal decreased proportionally. This result requires
some qualifications. We had expected that as the human H4 copy number
increased the expression would also increase. If there was no
competition between the two sets of genes then the mouse signal should
have been unaffected and remained stable, since the mouse copy
number does not change. If that logic is followed an additional step,
then the mouse signal should have decreased with human copy number when
measured as a ratio. However, the human H4 expression did not continue
to rise with the human H4 copy number. Effectively, the increase in
the human H4 gene copy number appears to have lowered the mouse gene
expression and therefore artificially raised the apparent human gene
expression. The result of this phenomenon is that the original human
H4/mouse H4 ratio that was calculated is certainly inaccurate in high
copy number cell lines. We have noticed that in very high copy number
cell lines, such as pFO004m1, both the human and mouse H4 genes are
expressed at low levels (Figure 3-27). This is probably the result of
factor distribution between the possible transcription units in such a
manner that none of the genes has a full complement of proteins
necessary for expression. Our reassessment of the expression data is
presented in Figure 4-8, and incorporates only data from cell lines
(generally low copy number cell lines) in which the competition
phenomenon was not readily apparent. The expression/copy was plotted as
before in Figure 3-17. The statistical differences between constructs
that were detailed earlier are still valid for this part of the data.
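
The calculation behind Figure 4-7, and the reason the human H4/mouse H4
ratio becomes misleading under competition, can be illustrated with a
minimal sketch. The function names and the densitometry values below
are hypothetical and are not data from this work; they only show how a
fall in the mouse signal inflates the ratio even when the human signal
itself is unchanged.

```python
def percent_mouse_signal(mouse_signal, human_signal):
    """Mouse share of the total S1 densitometry signal, in percent."""
    total = mouse_signal + human_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return 100.0 * mouse_signal / total


def human_to_mouse_ratio(mouse_signal, human_signal):
    """Human H4 / mouse H4 signal ratio."""
    if mouse_signal <= 0:
        raise ValueError("mouse signal must be positive")
    return human_signal / mouse_signal


# Hypothetical values in arbitrary densitometry units (not measured data).
# The human signal is identical in both cases; only the mouse signal drops.
cases = {
    "no competition": {"mouse": 100.0, "human": 50.0},
    "with competition": {"mouse": 25.0, "human": 50.0},
}

for label, s in cases.items():
    pct = percent_mouse_signal(s["mouse"], s["human"])
    ratio = human_to_mouse_ratio(s["mouse"], s["human"])
    print(f"{label}: mouse = {pct:.1f}% of total, human/mouse = {ratio:.2f}")
```

In this toy case the human/mouse ratio rises fourfold although the
human signal is constant, which is the kind of artifact the restriction
to low copy number cell lines is meant to avoid.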


CHAPTER  5 
DISCUSSION  AND  CONCLUSIONS 
Our  studies  over  the  last  several  years  have  contributed  to  the 
general  understanding  of  histone  gene  expression  and  of  the  expression 
of  human  genes  in  a  heterologous  system.   The  histone  genes  have  been 
studied  intensely  for  decades  and  only  now  are  beginning  to  be 
understood. From the work we have presented here and the work done by
others (Hanley et al., 1985; Sierra et al., 1983; Dailey et al., 1986;
van Wijnen et al., 1987) it is clear that the histone H4 promoter is
composed of several discrete DNA sequence elements, including the TATA
box, CAAT box, Sp1 site (5'-GGCGGG-3'), and GGTCC element. We have
also demonstrated that more distal sequences may have both positive
and negative effects on the transcriptional regulation of the F0108
human H4 histone gene.
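
As an illustration only, elements of this kind can be located in a
sequence by a simple scan of both strands. The sequence used below is a
made-up example and not the F0108 promoter, and the helper names are
assumptions introduced for the sketch; the motif strings are those
given in the text.

```python
import re

# Elements named in the text; the Sp1 core is written as given above.
MOTIFS = {
    "TATA box": "TATA",
    "CAAT box": "CAAT",
    "Sp1 core": "GGCGGG",
    "GGTCC element": "GGTCC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_elements(seq, motifs=MOTIFS):
    """Return sorted (start, strand, name) hits.

    start is the 0-based position on the upper strand; strand "-" means
    the motif reads 5' to 3' on the lower strand.
    """
    seq = seq.upper()
    hits = []
    for name, motif in motifs.items():
        patterns = {"+": motif}
        rc = reverse_complement(motif)
        if rc != motif:  # palindromic motifs (e.g. TATA) are reported once
            patterns["-"] = rc
        for strand, pattern in patterns.items():
            # A lookahead reports overlapping occurrences as well.
            for m in re.finditer(f"(?={pattern})", seq):
                hits.append((m.start(), strand, name))
    return sorted(hits)


# Hypothetical test sequence, not taken from the H4 gene.
example = "CCGGCGGGAACAATCGGTCCTTTATAAGG"
for start, strand, name in find_elements(example):
    print(start, strand, name)
```

A scan of this type only reports sequence matches; whether a match is
bound by a factor in vivo is a separate question addressed by the
protection experiments.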

We  initially  wanted  to  demonstrate  what  sequences  were  sufficient 
for  cap  site  initiation  of  transcription  in  vivo.   Previously,  Sierra 
et al. (1983) had demonstrated with a series of Bal 31 deletions that
sequences contained in the construct J67 (-47 bp), including the TATA
box (-30 bp) and GGTCC element (-47 bp), were sufficient for correct
initiation of transcription in vitro. In order to ascertain whether
these sequences were sufficient in vivo we constructed a series of
polyclonal cell lines in mouse C127 cells with the Bal 31 deletion
constructs as described in Chapter 3. Transcription from each
construct was measured by S1 nuclease analysis and the expression level
per copy determined. Six polyclonal cell lines of the construct J67
(-47 bp) were prepared and, although all cell lines contained detectable
copies of the J67 construct as determined by Southern blot analysis,
none of the cell lines initiated human H4 transcription correctly.
An  examination  of  the  in  vivo  protein/DNA  interactions  within  the 
proximal  promoter  region  of  the  human  H4  gene  (Pauli  et  al.,  1987)  had 
previously revealed two sites of interaction, Site I (-124 bp to -89
bp) and Site II (-64 bp to -23 bp). We believe that the lack of
correct in vivo transcription initiation from the J67 construct (-47
bp) is the result of the deletion of Site I and the distal half of Site
II. Even though the GGTCC element (-47 bp) and the TATA box (-32 bp)
are still present in the J67 construct, they are apparently
insufficient for site specific transcription initiation in vivo. The
GGTCC element that remains in the J67 construct is probably incapable
of binding its respective protein. Pauli et al. (1987) have
demonstrated that the in vivo factor interaction with the GGTCC element
occurs symmetrically at three G residues on both DNA strands. The J67
deletion disrupts the symmetry of this binding through deletion of the
distal G residue on the bottom strand. Additionally, the
CAAT..2bp..GGTCC motif that is well conserved in many H4 histone genes
(Wells, 1986) is disrupted by the J67 deletion, suggesting that it may
also be important for transcriptional regulation. Our results suggest
that multiple transcription factors are required for H4 transcription
initiation and, in support of this hypothesis, van Wijnen et al. (1987,
1988) have demonstrated specific protein binding regions within Site I
and Site II of the F0108 H4 histone gene in vitro. We were able to
demonstrate the necessity for all of the Site II protein/DNA
interactions since correct initiation of transcription was observed
with the construct J56 (-73 bp) that includes all of Site II. In
contrast to the in vitro transcription results of Sierra et al.
(1983), we have demonstrated that sequences between -47 and -73 bp,
included in the construct J56 (-73 bp), are required for H4 histone
transcription initiation in vivo.

The protein/DNA interactions at Site I (-124 bp to -89 bp) were
shown by Pauli et al. (1987) to overlap a putative Sp1 binding site
(5'-GGGGCGGGGC-3') as described by Briggs et al. (1986). We were
interested to know whether this Sp1 site was functional and contributed
to the transcriptional regulation of the F0108 human H4 histone gene.
A cell line that contained the Bal 31 deletion construct J50 (-100 bp)
was prepared and assayed by S1 nuclease analysis. With this cell line
we demonstrated that the additional sequences between -73 and -100 bp,
included in the construct J50, increased the level of in vivo
transcription at least 10 fold above the construct J56 (-73 bp). Our
result is consistent with the interaction of Sp1 or an Sp1-like protein
with this sequence, and with this interaction being responsible for the
increase in transcription we have noted. Additionally, we have been
able to demonstrate with genomic sequencing (Church and Gilbert, 1984)
that there is a factor in mouse C127 cells that binds to the Sp1
recognition sequence in vivo, although we cannot conclude that it is
indeed Sp1. Taken together, our results and those of Pauli et al.
(1987) and van Wijnen et al. (1987, 1988) implicate Sp1 as a positive
transcription factor in the regulation of this human H4 histone gene.

Because the 5' flanking region of the F0108 H4 histone gene is very
extensive,  we  characterized  the  contribution  of  more  distal  5'  flanking 
sequences  to  the  transcriptional  regulation  of  this  H4  histone  gene  in 
vivo.   We  first  established  that  when  all  of  Site  I  and  Site  II  were 
present  in  cell  lines  that  contained  the  construct  K8  (-155  bp)  no 
further  increase  in  the  level  of  transcription  was  detected.   Extension 
of  the  promoter  sequences  to  -215  bp  in  the  construct  pF0108A 
demonstrated  that  in  vivo  sequences  from  Site  I  (-124  bp)  to  -215  bp 
did  not  influence  the  level  of  transcription.   These  results  were 
determined  from  experiments  with  both  polyclonal  and  monoclonal  cell 
lines  of  K8  and  pF0108A. 

The inclusion of sequences up to -417 bp in the construct pF0005
resulted in a 2-fold increase in the level of transcription above that
demonstrated with the pF0108A construct cell lines. Previous analysis
of this region by Pauli et al. (1987) had revealed no detectable in
vivo protein/DNA interactions. In order to determine more precisely
the location of the positive transcription element, two deletions of
the pF0005 construct were prepared with Exonuclease III and assayed in
monoclonal cell lines and short term transient expression experiments.
The deletions have been denoted pFO005BSdel2-6 (-285 bp) and
pFO005BSdel2-10 (-335 bp). Our results from the monoclonal cell lines
constructed support the idea that the positive transcription element in
pF0005 is located in the sequences from -215 bp (pF0108A) to -335 bp
(pFO005BSdel2-10). Comparison of the level of transcription from the
pFO005BS construct (-417 bp) and pFO005BSdel2-10 (-335 bp) demonstrated
that the deletion (~80 bp) had not affected the level of
transcription. Only a single monoclonal cell line was obtained with
the construct pFO005BSdel2-6 (-285 bp) and the level of transcription
was shown to be lower than that of pFO005BS. Because of the lack of
appropriate cell lines we were unable to assess the effect of this
deletion on the level of H4 histone gene transcription. Ken Wright of
our laboratory has demonstrated similar transcription results in an in
vitro transcription system with these constructs (personal
communication). Preliminary analysis of the sequences from -215 bp to
-335 bp suggested that secondary structure might be responsible for the
function of this region. There are two possible inverted repeats
within the region that may form stable stem and loop structures.
Stable cell lines and short term transient expression experiments with
the construct pF0001 (-3.3 kb, internal deletion -586 bp to -215 bp)
also support the contention that the sequences from -417 bp to -215 bp
(pF0005) contain a positive transcription element. In polyclonal and
monoclonal cell lines the pF0001 construct is expressed at a
significantly lower level than pF0005 and pF0108A. This result was
duplicated in the transient expression experiments described in Chapter
3.

We examined even more distal sequences and demonstrated that in
stable monoclonal cell lines and short term transient expression assays
the construct pF0007 (-586 bp) exhibits the same level of expression
as pF0005 (-417 bp). If sequences extending to -1065 bp were included
(pF0002), there was a significant (2-3 fold) decrease in the level of
transcription. Based on this observation we proposed that there was a
negative regulatory element between -586 bp and -1065 bp in the histone
H4 promoter. The region was sequenced by the method of Maxam and
Gilbert (1980) in order to help identify any possible sequence elements
that might be responsible for the decrease in transcription. Our
analysis found two candidate sequences, Box 1 (-710 bp,
5'-TCCCCTCTCAG-3') and Box 2 (-580 bp, 5'-ATTCTCCTGT-3'), with homology
to negative regulatory elements as described by Baniahmad et al. (1987)
for the chicken lysozyme gene. To determine if these elements had any
functionality we constructed two deletions in the 460 bp BamHI/EcoRI
fragment of pF0002, designated pF0002D1 (-920 bp) and pFO002E9 (-730
bp). These constructs were assayed in comparison to pF0002, pF0007,
pF0005, pF0001, and pF0108A for expression of the F0108 H4 histone gene
in transiently transfected C127 and Ltk- mouse cells. In both cell
types we demonstrated that the sequences we had proposed based on
similarity were not responsible for the observed negative regulation.
Both Box 1 and Box 2 were included in the construct pFO002E9, which was
the most highly expressed construct of the group. Because pFO002E9 was
expressed at a level approximately 2.5 fold higher than pF0007 we
proposed that there was a positive element located between -586 bp and
-730 bp. The only obvious candidate sequence present in the region was
a CCAAT box located at -718 bp. We cannot conclude that this sequence
is responsible for the increase in the transcriptional level of
pF0002E9; however, it has been well documented that when located in the
proximal promoter region of many genes the CCAAT box functions in the
regulation of transcription in conjunction with other DNA sequences
(Dorn et al., 1987; McKnight and Kingsbury, 1982; McKnight et al., 1984;
McKnight and Tjian, 1987).

Our experiments did indicate the existence of a negative regulatory
element when we examined pF0002 (-1065 bp) and pF0002D1 (-920 bp). We
demonstrated that these constructs were expressed at a significantly
lower level in the short term transient assays than pFO002E9 (-730 bp).
We therefore concluded that the negative regulatory element suggested
by previous experiments more likely resided in the sequences between
-730 bp and -920 bp. Dr. Chrysogelos, of our laboratory, identified a
nuclease sensitive region (DNase I and S1) located between -720 bp and
-820 bp that may represent a protein/DNA interaction (Dr. Susan
Chrysogelos, personal communication). The sequence of this region
contains a stretch from -800 bp to -960 bp that is very A/T rich (70%).

We found that the region from -580 to -1010 bp contained two
excellent homologies to MARs (matrix attachment regions) as described
by Gasser and Laemmli (1987) and a topoisomerase II site (Sander and
Hsieh, 1985). This topoisomerase II site was confirmed in vitro with
purified enzyme by Dr. Tom Rowe (personal communication). Matrix
attachment sites on the eukaryotic chromosome are thought to function
in the regulation of gene expression through recognition of chromatin
domains and attachment to the nuclear matrix as has been demonstrated
for a number of genes, including Drosophila histone genes (Gasser and
Laemmli, 1987). Since the histone genes of higher eukaryotes are
clustered, it is possible that they may be divided into functional
domains on chromatin loops. Cockerill et al. (1986) demonstrated that
MARs were approximately 200 bp in length and 74% A/T. They also
demonstrated possible binding sites for topoisomerase II as described
by Sander and Hsieh (1985). The upstream region of the F0108 gene from
-800 to -960 is 70% A/T and contains at least one confirmed
topoisomerase II site. This evidence suggests that binding to the
nuclear matrix and DNA topology may function in the regulation of
histone H4 gene expression.
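
The A/T percentages quoted above are simple base counts over a stretch
of sequence. A minimal sketch of such a count, and of a sliding-window
search for A/T rich stretches, is given below. The function names, the
default window of 200 bp and the 70% threshold are assumptions chosen
to echo the figures in the text; this is not the procedure used to
analyze the F0108 sequence.

```python
def at_fraction(seq):
    """Fraction of A and T bases in seq (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("A") + seq.count("T")) / len(seq)


def at_rich_windows(seq, window=200, threshold=0.70):
    """Return (start, fraction) for every window at or above the threshold.

    start is the 0-based position of the window on the given strand.
    """
    seq = seq.upper()
    hits = []
    for start in range(0, len(seq) - window + 1):
        frac = at_fraction(seq[start:start + window])
        if frac >= threshold:
            hits.append((start, frac))
    return hits


# Hypothetical 10-base example: 7 of 10 bases are A or T.
print(at_fraction("ATATATAGGC"))
```

A window length and threshold would in practice be chosen from the
published MAR characteristics rather than fixed as here.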

Additionally, we demonstrated that pF0002 and pF0005 were expressed
at nearly the same level in transiently transfected cells where the
DNA was presumably episomal. However, in stable cell lines, pF0005 was
expressed at a significantly higher level (~3 fold) than pF0002. These
results  suggest  that  the  state  of  the  DNA,  episomal  or  integrated, 
affects  the  function  of  certain  DNA  elements.   This  is  consistent  with 
our  hypothesis  that  attachment  to  the  nuclear  matrix  and  DNA  topology 
have  a  role  in  the  regulation  of  the  F0108  human  H4  histone  gene. 

We have also noted that many of the negative regulatory elements
previously described are located at a considerable distance from the
gene they are associated with and this is consistent with our
hypothesis (Baniahmad et al., 1987; Laimins et al., 1986).
Additionally, experiments performed by Dr. Pauli, of our laboratory,
suggested that histone H1 and a 43 kd nuclear acidic protein
(non-histone) bound specifically to this region (Dr. Urs Pauli, personal
communication). It has been previously suggested that histone H1 might
be a general negative regulatory factor for transcription (Weintraub,
1985).

Another possibility that we have considered is that the strings of
poly A and poly T may have an unusual secondary structure in the
upstream region of the H4 promoter. Although we have not been able to
perform any direct analysis, it seems possible that under certain
circumstances this segment of DNA might assume a "bent" conformation as
described by Koo et al. (1986) and Travers (1987). It has been
elegantly demonstrated in a number of systems, both prokaryotic and
eukaryotic, that DNA can bend intrinsically if the necessary bases are
present or can bend in response to the interaction of a protein (Salvo
and Grindley, 1987; Koo et al., 1986). Bending of DNA requires that
there be proper spacing between the AA dinucleotide pairs and poly A
tracts. This spacing corresponds to approximately 10 bp, or a single
turn of the helix (Koo et al., 1986). Several of the poly A tracts in
the upstream region of the H4 gene from -945 bp to -880 bp appear
appropriately spaced. This evidence suggests that the upstream region
of the H4 promoter may have an unusual structure that might be
responsible for the negative regulation we demonstrated.
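
The spacing criterion can be made concrete with a small sketch that
lists runs of A or T and reports adjacent runs whose starts lie about
one helical turn apart. The minimum run length, the tolerance of 1 bp
and the function names are assumptions made for illustration, and the
sequence is invented; no such analysis of the H4 upstream region is
implied beyond the visual inspection described above.

```python
import re


def a_tract_starts(seq, min_len=3):
    """0-based start positions of runs of at least min_len A's or T's."""
    pattern = re.compile(rf"A{{{min_len},}}|T{{{min_len},}}")
    return [m.start() for m in pattern.finditer(seq.upper())]


def phased_pairs(starts, period=10, tolerance=1):
    """Adjacent tract pairs whose spacing is within tolerance of period."""
    pairs = []
    for a, b in zip(starts, starts[1:]):
        if abs((b - a) - period) <= tolerance:
            pairs.append((a, b))
    return pairs


# Hypothetical sequence with tracts placed 10 bp apart.
example = "AAAAGCGCGCAAAAGCGCGCTTTTGC"
starts = a_tract_starts(example)
print(starts, phased_pairs(starts))
```

Tracts phased in this way are only candidates for intrinsic bending; a
direct test, such as gel mobility of the fragment, would be needed to
establish it.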

We had previously implicated an additional positive regulatory
element in the distal region of the H4 promoter. Preliminary
polyclonal cell line experiments suggested that the pF0116 fragment
(-6.0 to -7.5 kb) of λ HHG41 was able to enhance the level of
transcription several fold. Helms et al. (1987) had shown that this
fragment could stimulate CAT gene expression 4-5 times in HeLa cells
when located at the 3' end of the gene. Their experiments suggested
that the element might have the properties of an enhancer. We examined
this possibility further through the construction of a number of
variant enhancer constructs. We made pF0004, pF0004R, and pF0006 as
described in Chapter 3 to test the hypothesis that this element had the
distance and orientation independent properties of an enhancer element.
We examined a number of monoclonal cell lines prepared with each
construct and established that the pF0116 fragment did not conclusively
enhance the level of expression of the H4 gene in mouse C127 cells
above that found with pF0108A or pF0108X. The construct pF0004 was
highly expressed in a number of cell lines; however, this was
apparently the result of high copy number and not enhancement of
expression. The region was sequenced by Ken Wright and Dr. Urs Pauli
and they found that the pF0116 fragment exhibited three sequences in
the 500 bp EcoRI (-6.0 kb)/XbaI (-6.5 kb) section with strong
similarity to the consensus core sequence of the SV40 and Ig heavy
chain enhancers (Maniatis et al., 1987; Khoury and Gruss, 1983). We
can only speculate that the lack of enhancer activity in mouse C127
cells, a fibroblast cell line, is due to the presence of negative
regulatory factors or the absence of positive factors required for
activity. Consistent with this idea, Wasylyk and Wasylyk (1986)
demonstrated that the Ig heavy chain enhancer was negatively regulated
in fibroblasts, but transcription could be stimulated in these cells if
certain sequences were deleted. Finer analysis of the pF0116 fragment
in different cell types should reveal if this element is regulated in
such a manner.

Our studies of the pF0004 construct allowed us to associate
repetitive sequences with higher copy number of the monoclonal cell
lines and with specific integration. Most of our constructs integrated
via the pathway described by Perucho et al. (1980). There was a
cointegrate stage followed by integration at a limited number of sites
in the cellular chromatin. We found that minor repeated sequences,
with some homology to the Alu repeat (Collart et al., 1985), located
in the distal 5' flanking region of pF0003 were responsible for the
higher copy number in these cell lines. The pF0003 constructs had a
considerable amount of tandem integration, which also suggested that the
minor repeats did not perturb the integration pathway. pF0004 was
studied in more detail, and we concluded that the high copy number and
heterogeneous integration observed were due to specific integration via
the Alu repeat located in the pF0116 fragment (~ -7.0 kb). It has been
previously demonstrated that the human histone genes are interspersed
with various repeated sequences and often flanked by Alu repeats
(Collart et al., 1985). Perhaps this unusual sequence organization
accounts for the clustered but random organizational pattern of this
family of genes.

When  we  examined  the  expression  of  the  human  H4  histone  gene  in  the 
heterologous  C127  cells  we  found  that  only  a  limited  number  of  copies 
were  expressed.   In  addition,  we  found  that  as  copy  number  of  the  human 
H4  gene  increased,  the  expression  of  the  mouse  H4  gene  decreased.   This 
observation  has  been  made  previously  by  Capasso  and  Heintz  (1985)  in 
which  they  found  that  in  a  cell  line  with  a  high  copy  number  of  the 
human  H4  gene,  pHuH4  (120  copies),  the  endogenous  mouse  H4  genes  were 
completely  shut  off.   Our  results  are  similar  and  suggested  competition 
for  transcription  factors.  However,  when  the  human  H4  gene  was  present 
in  very  high  copy  number  (cell  lines  pF0004ml ,  ~  250  copies  and 
pFG003ml,  ~  139  copies)  neither  the  mouse  nor  human  H4  genes  were 
expressed  to  any  significant  extent.   We  feel  that  it  is  likely  that 


■■,"»-**   ■ '.  %. 


179 
the  regulatory  molecules  necessary  for  transcriptional  control  are 
limited,  and  when  spread  among  a  large  number  of  transcription  units, 
none  of  the  units  has  a  full  complement.   During  the  course  of  our 
studies  we  have   demonstrated  that  the  endogenous  mouse  H4  and 
transfected  human  H4  histone  genes  are  in  direct  competition  for  a 
limited  transcription  factor.   Genomic  sequencing  experiments  described 
in  chapter  4  demonstrated  that  we  could  detect  binding  in  vivo  of  a 
protein to the Sp1 site (-100 bp) located in Site I. We were never
able  to  demonstrate  in  vivo  binding  to  Site  II  although  the  existence 
of  the  factors  in  mouse  cells  has  been  demonstrated  in  vitro  by  Andre 
van Wijnen (personal communication). The genomic sequencing
experiments that we have described also demonstrated that the binding
to the Sp1 site was titratable with increased copy number of the pF0003
construct. Binding was detected in cell lines pF0003M5 (~ 15 copies)
and pF0003M6 (~ 25 copies) but not in pF0003M1 (~ 140 copies). This
suggested that the interaction at the Sp1 site was titratable even
though Sp1 is known to be an abundant transcription factor (Dynan and
Tjian, 1985). Perhaps the binding of Sp1 to this sequence is dependent
on  the  interaction  with  adjacent  histone  specific  transcription  factors 
that  we  have  been  unable  to  detect  with  genomic  sequencing.   Our 
inability to detect the protein/DNA interactions at Site I in the mouse
cell  lines  may  simply  reflect  minor  differences  in  analogous  binding 
proteins  between  the  mouse  and  human  cells.  These  experiments  support 
the  contention  that  the  H4  histone  genes  in  the  mouse  cell  lines 
directly  competed  for  a  limiting  transcription  factor  or  factors  that 
function in the regulation of H4 histone gene expression in vivo.

In conclusion, our studies have described the functional role of
transcription factor interactions seen at both Site I and Site II.
Site  II  was  required  for  initiation  of  transcription  and  Site  I 
augmented  the  level  of  transcription  in  a  positive  manner.   We 
concluded  that  in  mouse  C127  cells,  an  enhancer-like  element  in  the  far 
upstream region of the H4 promoter was not active. The possibility of
a negative regulatory element was investigated, and our results suggest
that such an element lies in the sequences upstream of -730 bp. The sequences
from  -800  to  -960  bp  were  shown  to  be  70%  A/T  and  contain  putative 
nuclear  matrix  attachment  and  topoisomerase  II  sites.   The  results  of 
our  studies  suggest  that  the  promoter  of  the  F0108  H4  histone  gene,  as 
defined  in  vivo,  may  be  more  extensive  than  previously  thought.  Further 
deletion  analysis  and  investigation  will  describe  the  specific 
sequences  responsible  for  the  transcriptional  regulation  of  this  gene 
in  vivo. 


APPENDIX  A 
SAMPLE  COPY  NUMBER  CALCULATION 

To determine the copy number of each monoclonal cell line, the human
H4 gene signal, mouse 18S ribosomal signal, and plasmid DNA standards
were subjected to densitometric analysis as described in Chapter II. Once
this was completed, all the copy number blots were compared to each other
visually, and on each blot of equivalent exposure length an 18S
ribosomal band was chosen as the standard for that experiment. This decision
involved  comparison  of  many  films  and  the  photographs  of  the  gels  prior 
to  transfer.  Every  effort  was  made  to  ensure  that  equivalent  standards 
were  picked  from  the  different  experiments. 

An example calculation is given below for the cell line pF0108Am2.

The  densitometric  values  determined  are  listed  below  for  each  variable: 

Ribosomal standard (Rstd) = 6309 densitometry units (DU).
pF0108A ribosomal value (SRstd) = 5891 DU.
pF0108A human value (HV) = 5138 DU.
10 pg (1.3 copies/diploid genome) control = 820 DU = 630 DU/copy.
50 pg (6.5 copies/diploid genome) control = 1923 DU = 295 DU/copy.
100 pg (13 copies/diploid genome) control = 3938 DU = 302 DU/copy.


pF0108Am2 copy number = [(Rstd / SRstd) x HV] / plasmid control
                      = [(6309 / 5891) x 5138] / 295
                      = 19
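
The arithmetic of this appendix can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and variable names (du_per_copy, copy_number, standards, estimate) are hypothetical and do not come from the original analysis, and the 295 DU/copy standard is taken from the worked example above.

```python
# Illustrative sketch only; all function and variable names are hypothetical.

def du_per_copy(signal_du: float, copies: float) -> float:
    """Densitometry units (DU) contributed by one gene copy in a plasmid standard."""
    return signal_du / copies


def copy_number(r_std: float, sr_std: float, hv: float,
                standard_du_per_copy: float) -> float:
    """Normalize the human H4 signal to the 18S standard, then divide by DU/copy."""
    loading_correction = r_std / sr_std          # Rstd / SRstd
    normalized_signal = hv * loading_correction  # human signal corrected for loading
    return normalized_signal / standard_du_per_copy


# Plasmid standards from the text: copies per diploid genome -> signal in DU.
standards = {1.3: 820, 6.5: 1923, 13: 3938}
per_copy = {copies: du_per_copy(du, copies) for copies, du in standards.items()}

# Values for the example cell line; 295 DU/copy is the standard used in the text.
estimate = copy_number(r_std=6309, sr_std=5891, hv=5138, standard_du_per_copy=295)
print(f"estimated copy number: {estimate:.1f}")
```

The estimate comes out at about 18.7, which rounds to the 19 copies reported above. The per-copy values computed from the standards (about 630.8, 295.8 and 302.9 DU/copy) match the listed 630, 295 and 302 DU/copy when truncated to whole numbers.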

APPENDIX B
SAMPLE CALCULATION OF HUMAN H4 EXPRESSION

This example serves to illustrate how the mouse H4 internal control
and the plasmid DNA markers were used to calculate the level of human
H4 expression. The example presented here is for the cell line
pF0005m6. Pertinent numbers are listed and then the calculation is
done. Because of the differences in the intensities of S1 protected
fragments, the mouse and human H4 values had to be determined from
exposures of different lengths.

Human  H4  densitometry  value  (17  day  exposure)  =  2840  DU. 
Mouse  H4  densitometry  value  (16  hour  exposure)  =  1397  DU. 
pBR322/HpaII  marker  band  #1  (16  hour  exposure)  =  1701  DU. 
pBR322/HpaII  marker  band  #1,  1:4  dilution  (17  day  exposure)  =  5510  DU. 

Calculation: 

1) 5510 x 4 = 22040 units.

2) 22040 / 1701 = 12.95 (the fold difference from 16 hours to 17 days
exposure).

3)  2840  /  [1397  x  12.95]  =  0.16  =  human/mouse  expression  ratio. 
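
The three steps above can be written as a short Python sketch. It is offered as an illustration under stated assumptions, not as the procedure actually used: the names exposure_factor and human_mouse_ratio are hypothetical, and the marker band is assumed to scale linearly with both dilution and exposure time.

```python
# Illustrative sketch only; function names are hypothetical.

def exposure_factor(marker_long_du: float, dilution: float,
                    marker_short_du: float) -> float:
    """Fold difference in signal between the long and short exposures.

    The marker on the long exposure was diluted, so its signal is first
    scaled back up by the dilution factor (steps 1 and 2).
    """
    undiluted_long = marker_long_du * dilution
    return undiluted_long / marker_short_du


def human_mouse_ratio(human_du: float, mouse_du: float, factor: float) -> float:
    """Human/mouse expression ratio with the mouse value scaled to the long exposure (step 3)."""
    return human_du / (mouse_du * factor)


# Values listed for the example cell line.
factor = exposure_factor(marker_long_du=5510, dilution=4, marker_short_du=1701)
ratio = human_mouse_ratio(human_du=2840, mouse_du=1397, factor=factor)
print(f"exposure factor: {factor:.2f}, human/mouse ratio: {ratio:.2f}")
```

With the listed values, the exposure factor is just under 13 and the human/mouse ratio rounds to 0.16, in agreement with step 3.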

APPENDIX C
TABLE OF CONSTRUCTS

Construct         5' sequence (bp)          Vector           Comments

J67               -47                       pBR322           Bal 31 deletion of pF0108A
J56               -73                       pBR322           Bal 31 deletion of pF0108A
J50               -100                      pBR322           Bal 31 deletion of pF0108A
K8                -155                      pBR322           Bal 31 deletion of pF0108A
L14               -185                      pBR322           Bal 31 deletion of pF0108A
pF0108A           -215                      pBR322           3' site = HindIII (+1877)
pF0108X           -215                      pUC19            3' site = XbaI (+1107)
pF0005            -417                      pUC13            3' site = PstI (+1677)
pF0007            -586                      pUC19            3' site = XbaI
pF0002E9          -730                      pUC19            3' site = XbaI
pF0002D1          -920                      pUC19            3' site = PstI
pF0002            -1065                     pUC8             3' site = PstI
pF0001            -215/-586 to -3300        pBR322           Internal deletion of pF0919
pF0003            -6500                     pUC13            3' site = XbaI
pF0004            -215 + pF0116             pUC8             same orientation as genomic
pF0004R           -215 + pF0116             pUC8             opposite orientation
pF0006            -215 + 500 bp EcoRI/XbaI  pUC19            same orientation as genomic
                  fragment from pF0116
pF0005BS          -417                      Bluescript M13+
pF0005BSdel2-10   -335                      Bluescript M13+  Exonuclease III deletion
pF0005BSdel2-6    -285                      Bluescript M13+  Exonuclease III deletion